# How-to

## Ansible environment

There are 2 different environments used to run Ansible:

- AWX, from which all Ansible playbooks are run
- molecule, to run tests locally or in gitlab-ci

The Ansible reference version is the one that comes with AWX. AWX 14.0.0 comes, for example, with Ansible 2.9.11.

```{important}
Both environments should stay synchronised. When updating AWX, the molecule environment shall be updated as well. The reverse is also true: when adding an external dependency to the molecule environment, it shall be added to AWX if not already present.
```

### AWX environment

The Ansible environment used by AWX is a Python virtual environment that is part of the [awx docker image](https://gitlab.esss.lu.se/ics-docker/awx). It's installed under `/var/lib/awx/venv/ansible`. To add extra requirements to this environment, update the [awx Dockerfile](https://gitlab.esss.lu.se/ics-docker/awx/-/blob/master/Dockerfile). Tag the repository to release a new image. See {ref}`AWX deployment `.

### Molecule environment

The environment used for molecule tests is a conda environment defined in the [conda-environments](https://gitlab.esss.lu.se/ics-infrastructure/conda-environments) repository. To update this environment, update the [molecule_env.yml](https://gitlab.esss.lu.se/ics-infrastructure/conda-environments/-/blob/master/molecule_env.yml) file. Tagging this repository will automatically update all gitlab-runners. Developers have to update their local environment manually using that file.

## EPICS Archiver Appliance

### Updating the Archiver Appliance

- Download the new release from
- Upload it to Artifactory under (create a directory named after the tag)
- Update the `epicsarchiverap_release` and `epicsarchiverap_release_tag` variables in the role under `defaults/main.yml`

### Updating tomcat

Tomcat is installed manually because installing it via yum would pull in java. We want to manage java via another Ansible role.
- Download the new version of tomcat from : `apache-tomcat-.tar.gz`
- Upload it to Artifactory under
- Update the `epicsarchiverap_tomcat_version` variable under `vars/main.yml` in the ics-ans-role-epicsarchiverap role
- Sync the following tomcat configuration templates in the role with the files from the archive (under the `conf` directory):
  - context.xml.j2
  - server.xml.j2
  - tomcat-users.xml.j2
  - web.xml.j2
- Check the list of variables in the `bin/catalina.sh` script from the archive. Update the variables defined in the `setenv.sh.j2` template from the role if required.

## EPICS Controls VLAN

When a new Controls VLAN is created (to host IOCs), there are several things to update:

- Archiver Appliance: the VLAN broadcast address shall be added to one archiver (`epicsarchiverap_epics_ca_addr_list` variable)
- Alarm Server: the VLAN broadcast address shall be added to the alarm server (`epics_ca_addr_list` variable)
- Channel Finder: a new RecCeiver shall be deployed on the VLAN to send data to the Channel Finder. See the [recceiver] group.
- EPICS Gateways: configure existing gateways to access this network. Create a new one dedicated to this network if required.
- LCR workstations (TN only): add the VLAN broadcast address to the `epics_ca_addr_list` variable in the [local_control_room] group or one of its subgroups

## EPICS Gateways

### Channel Access Gateway

To deploy a new EPICS CA Gateway, the following variables are required:

- `epics_cagateway_client_broadcast`
- `epics_cagateway_write`: set to `false` for a read-only gateway, `all` for read-write

See [ics-ans-role-epics-cagateway](https://gitlab.esss.lu.se/ics-ansible-galaxy/ics-ans-role-epics-cagateway) for more information. For [ro-epics-gw-tn] and [rw-epics-gw-tn], `epics_cagateway_client_broadcast` is set to `epics_addr_list_broadcast_tn`, which is defined in the [all] group.
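As an illustration, the group variables for a read-only CA gateway could look like the following sketch. The file name and broadcast address are made-up placeholders, not values from a real deployment:

```yaml
# group_vars/my-ca-gateway.yml -- hypothetical group; replace with real values
epics_cagateway_client_broadcast: 172.30.255.255   # broadcast address of the client-side network
epics_cagateway_write: false                       # read-only gateway; use "all" for read-write
```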
### PV Access Gateway

To deploy a new EPICS PVA Gateway, the following variables are required:

- `pva_gateway_clients`
- `pva_gateway_readonly`: set to `true` or `false`
- `pva_gateway_servers`
- `pva_gateway_static_routes`: list of static routes to access the clients (if required)

See [ics-ans-role-pva-gateway](https://gitlab.esss.lu.se/ics-ansible-galaxy/ics-ans-role-pva-gateway) for more information. For [ro-epics-gw-tn] and [rw-epics-gw-tn], the variables `epics_pva_gateway_clients_tn`, `epics_pva_gateway_server_clients_tn` and `epics_pva_gateway_static_routes_ieg` are defined in the [all] group.

## ESS Notify

ESS Notify is an application to send notifications to mobile phones. It includes a server that acts as a proxy and a mobile client.

### ESS Notify client

The iOS client source can be found here:

### ESS Notify server

[ess-notify-server](https://gitlab.esss.lu.se/ics-software/ess-notify-server) is a Python web application built with [FastAPI]. The API is exposed via Swagger UI:

The application doesn't store any passwords. Users are authenticated using LDAP. On successful login, a JWT is created; it shall be sent in the _Authorization_ header with any further requests. The token expiry time is set to 30 days by default. The value can be changed using the `ACCESS_TOKEN_EXPIRE_MINUTES` variable defined in the [ess_notify_servers group].

Administrators define a list of services using the API (`POST /api/v1/services/`). Each service includes:

- id: UUID (Universally Unique Identifier)
- category: name of the service
- color: color used to display the service's notifications in the client
- owner: responsible for the service

Users subscribe to the desired services from the client application. A new notification can be sent using the `POST /api/v1/services/{service_id}/notifications` endpoint. It can be done by any application (logbook, alarm server).
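As a sketch of such a request with curl — the server URL and service id below are placeholders, not a real ESS Notify instance — the call could be built like this. The leading `echo` only prints the command, so the snippet is safe to run as-is:

```shell
# Placeholder values -- substitute the real ESS Notify URL and service id.
SERVICE_ID="00000000-0000-0000-0000-000000000000"
NOTIFY_URL="https://notify.example.org/api/v1/services/${SERVICE_ID}/notifications"

# Notification payload: title plus the optional subtitle and url fields.
PAYLOAD='{"title": "Disk almost full", "subtitle": "ioc01: /var at 95%", "url": "https://status.example.org"}'

# Print the curl command; remove the leading echo to actually send it.
echo curl -X POST "$NOTIFY_URL" -H "Content-Type: application/json" -d "$PAYLOAD"
```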
This doesn't require any authentication (only the service id is needed), but a check is performed on the IP address: the request shall come from an IP that is part of the `ALLOWED_NETWORKS` variable defined in the [ess_notify_servers group].

A notification includes:

- title: title of the notification
- subtitle: extra text
- url: optional url to redirect to in the client

When a notification is received by the server, a push is sent to all users who subscribed to the service. Note that rate limiting is enabled to prevent a flood of notifications from being sent. This is achieved using traefik. Default values are configured in the [role](https://gitlab.esss.lu.se/ics-ansible-galaxy/ics-ans-role-ess-notify-server/-/blob/master/defaults/main.yml#L33). See and for more information.

### ess-pynotify

[ess-pynotify](https://gitlab.esss.lu.se/ics-infrastructure/ess-pynotify) is a Python library to send notifications via the ESS Notify Server. It can be used both as a library and as a CLI tool. Refer to the [README.md](https://gitlab.esss.lu.se/ics-infrastructure/ess-pynotify). Sending a notification only requires a `POST` to the `/api/v1/services/{service_id}/notifications` endpoint, which can easily be done using curl. This library is only provided as a convenience and for easier integration.

## Hashicorp vault [WIP]

```{attention}
Still a WIP due to the procedures and standards not being completely decided yet.
```

Hashicorp vault secures, stores, and tightly controls access to tokens, passwords, certificates and API keys. You can find our instance here

### How to move CSEntry variables to Gitlab

1. [Move variables to gitlab](#move-variables-to-gitlab)
2. [Delete/clean variables from csentry](#delete-csentry-variables)
3. [Test the migration](#test-the-migration)

#### Move variables to Gitlab

The Ansible group usually has a playbook, and it is to that playbook that we migrate the variables.
The corresponding variables found in the csentry Ansible groups and host variables can be directly copied to `group_vars/group_name.yml` and the respective `host_vars/host_name.yml`. If they don't exist, we should create them. See the folder structure below for an example:

```bash
├── group_vars
│   └── group_name.yml
├── host_vars
│   ├── host_name1.yml
│   └── host_name2.yml
.
.
.
.
└── molecule
    └── default
        └── molecule.yml
```

The variable precedence from highest to lowest in our workflow:

* extra_vars in AWX
* host_vars in playbook
* csentry host_vars
* group_vars in playbook
* csentry group_vars
* vars in default/main.yml

```{caution}
Please check all hosts in csentry and also the playbook itself for legacy variables that need to be moved. Don't forget that we also have [technicalnetwork](https://csentry.esss.lu.se/network/groups/view/technicalnetwork), [labnetworks](https://csentry.esss.lu.se/network/groups/view/labnetworks) and [all](https://csentry.esss.lu.se/network/groups/view/all), which may also contain related variables. There might be several other parent/child groups related to the Ansible group variables we need to migrate.
```

#### Delete CSentry variables

After all variables have been migrated, please delete the variable entries in csentry.

#### Test the migration

The next step is to test the migrated variables. We should test with Molecule and after that also run against the hosts, preferably in AWX with `check mode`, which does not apply any changes to the hosts and only shows you whether there would be any. If there are no changes, we can then run in `run mode`. If you are using the hashicorp lookup for vaulted secret variables, you will need to change the token for the job template. See [this section](#using-hashicorp-vault-token-in-awx).

### Using hashicorp vaulted secrets

```{hint}
For variables that are vaulted and need to be moved to hashicorp vault, please see [this section](#decrypting-secret-variables) for how to decrypt them.
```

Here is an example of what using a hashicorp vaulted variable looks like:

```yaml
test_postgres_user: "{{ lookup('hashi_vault', 'secret=secret/data/tn/test-app/postgres:data')['user'] }}"
test_postgres_password: "{{ lookup('hashi_vault', 'secret=secret/data/tn/test-app/postgres:data')['password'] }}"
```

These can be added to `group_vars` and `host_vars`, but they might cause an issue when testing with molecule. Thus we need to override the vaulted variables in `molecule.yml` like this:

```yaml
# molecule/default/molecule.yml
provisioner:
  name: ansible
  inventory:
    group_vars:
      group_name:
        test_postgres_password: ""
```

Molecule will then use these variables instead. Here is a more extensive example of how the variables may look:

```yaml
provisioner:
  name: ansible
  inventory:
    group_vars:
      group_name:
        test_app:
          test_edition: sigma
          test_version: 1.2.4
    host_vars:
      host_name1:
        test_variablelist:
          - list1
          - list2
        test_variable:
          test: "hey"
      host_name2:
        test_edition: alpha
```

The variable precedence from highest to lowest with molecule:

* host_vars in molecule.yml
* host_vars in playbook
* group_vars in molecule.yml
* group_vars in playbook
* vars in role default/main.yml

#### Creating and using Hashicorp vault secrets

We can manually add secrets in Hashicorp vault. From the secret tab, we can access our "secret engine" `secrets` and from there create a new secret with the "Create secret" button. The secret is then added in the specified path with its key:value pairs. For example, if we look at the [ics-ans-olog-es](https://gitlab.esss.lu.se/ics-ansible-galaxy/ics-ans-olog-es) playbook, we can see in one of the variables how the lookup for a secret is written:

```yaml
ldap.manager.password: "{{ lookup('hashi_vault', 'secret=secret/data/tn/olog-es/ldap:data')['password'] }}"
```

Let's look further into what's written: `'secret=secret/data/tn/olog-es/ldap:data')['password']`

The first part of the path, `secret`, is our secret engine.
Then `data` for the content in there. `tn` is the network domain. Then `olog-es`, our application/group name. `ldap` is the category we created to store our values (it could be `database`, `token` or anything else). Then the `data` inside `ldap`. Lastly, we have the key of the key:value pair in brackets, in this case `password`. We could also store `user` and any other pair here, for example `'secret=secret/data/tn/olog-es/postgres:data')['user']` and `'secret=secret/data/tn/olog-es/postgres:data')['password']`.

The path could actually be written however one wants, but for consistency we decided on this layout. At the moment our secrets are stored based on network zones, for example TN, GPN, CSLab, and then application/group name. This might change in the future.

> **Note:**
> As a best practice, we should keep production and staging secrets separate and create new ones if they are not.

For more examples of how everything is set up, please see these roles and their corresponding vaulted secrets:

*
*
*

##### Certificate role [WIP]

If our playbook runs the [ics-ans-role-certificate](https://gitlab.esss.lu.se/ics-ansible-galaxy/ics-ans-role-certificate) role, we need to include the following in our group_vars to be able to make a certificate signing request. Add these to `group_vars/group_name.yml`:

```yaml
certificate_adcs_username: "{{ lookup('hashi_vault', 'secret=secret/data/gpn/adcs/cert-request-user:data')['user'] }}"
certificate_adcs_password: "{{ lookup('hashi_vault', 'secret=secret/data/gpn/adcs/cert-request-user:data')['password'] }}"
```

___

If our host is **not** on "tn.esss.lu.se" or "cslab.esss.lu.se":
We need to include this in the host_vars to fetch the correct certificate. Add these to `host_vars/host_name.yml`:

```yaml
certificate_custom_key: "{{ lookup('hashi_vault', 'secret=secret/data/gpn/certificate/esss.lu.se:data')['key'] }}"
certificate_custom_certificate: "{{ lookup('hashi_vault', 'secret=secret/data/gpn/certificate/esss.lu.se:data')['certificate'] }}"
certificate_custom_certificate_chain: "{{ lookup('hashi_vault', 'secret=secret/data/gpn/certificate/esss.lu.se:data')['certificate-chain'] }}"
```

##### Using Hashicorp vault token in AWX

We need the correct access rights to look up our secret variables in hashicorp vault. This can be done with a small modification to the job template in AWX; otherwise you might get errors like these during a run.

#### Decrypting secret variables

```{hint}
The ansible vaulted variable can usually be found in bitwarden.
```

To decrypt an ansible vaulted variable, we need the vault password, which can be found in bitwarden under **Ansible vault password**. If you haven't set `ANSIBLE_VAULT_PASSWORD_FILE`, you will need to enter the vault password manually when the decrypt process prompts for it. We also need to change the format of the text to something like the examples below:

```bash
$ ansible-vault decrypt
- sample output -
Vault password:
Reading ciphertext input from stdin
$ANSIBLE_VAULT;1.1;AES256
39656637333832303638363264393033653433346634356438316636643964666332373630356564
3461316164666139626630343930376233363832653064310a303631306235333365666463393834
38643330306534333065323033643838303338386664353637653131346530623836393366346430
3565663666663539370a623665353663383563643961633761303932616564313662663066623831
3064
#Press Enter once then Ctrl + D
Decryption successful
test
```

This also works without newlines in the ciphertext.
```bash
$ ansible-vault decrypt
- sample output -
Vault password:
Reading ciphertext input from stdin
$ANSIBLE_VAULT;1.1;AES256
396566373338323036383632643930336534333466343564383166366439646663323736303565643461316164666139626630343930376233363832653064310a303631306235333365666463393834386433303065343330653230336438383033383866643536376531313465306238363933663464303565663666663539370a6236653536633835636439616337613039326165643136626630666238313064
#Press Enter once then Ctrl + D
Decryption successful
test
```

```{important}
There needs to be a newline between `$ANSIBLE_VAULT;1.1;AES256` and the ciphertext. Also notice that there are no single quotation marks around the ciphertext.
```

### FAQ

I get this error during a run:

```json
{
    "msg": "An unhandled exception occurred while templating '{{ lookup('hashi_vault', 'secret=secret/data/gpn/adcs/cert-request-user:data')['user'] }}'. Error was a , original message: An unhandled exception occurred while running the lookup plugin 'hashi_vault'. Error was a , original message: No Vault Token specified",
    "_ansible_no_log": false
}
```

See [this section](#using-hashicorp-vault-token-in-awx).

___

## Monitoring using CLI tools

[alerta](https://alerta.io/) is the unified command-line tool, terminal GUI and Python SDK for the alerta monitoring system. It can be used to send, query, tag, change the status of, or delete alerts, dump historical data, or view raw alert data. It can also be used to send heartbeats to the alerta server and to generate alerts based on missing or slow heartbeats. The client tool can be installed through the alerta.yml file located in the [conda-environments](https://gitlab.esss.lu.se/ics-infrastructure/conda-environments) repository.
Set up the configuration file `~/.alerta.conf`:

```ini
[DEFAULT]
timezone = Europe/Stockholm
output = json
profile = production

[profile production]
endpoint = https:///api
key =
sslverify = no
timeout = 10.0
debug = no
```

Set the following environment variables:

```bash
export ALERTA_CONF_FILE=~/.alerta.conf
export ALERTA_DEFAULT_PROFILE=production
echo "export ALERTA_CONF_FILE=~/.alerta.conf" >> ~/.bashrc
echo "export ALERTA_DEFAULT_PROFILE=production" >> ~/.bashrc
source ~/.bashrc
```

To display alerts in "top" UNIX output format, run `alerta top`.

See the [alerta CLI how-to guide](https://docs.alerta.io/cli.html).

## GitLab Runners

The GitLab runners provided by CSI are shared runners. Tags are used to allocate a job to a runner. Anyone can use a runner by specifying the proper tag. Note that in some cases, runners could be restricted to a specific project or group. This would make sense for `molecule` runners, for example.

To add a new GitLab runner, create a new VM and add it to the [gitlab_runner](https://csentry.esss.lu.se/network/groups/view/gitlab_runner) group. By default, two executors are deployed, as defined by the `gitlab_runner_to_register` variable in that group:

- a docker executor with the `docker` tag
- a shell executor with the `molecule` and `packer` tags

If you want to deploy something else, you can override this variable at the host level. See [gitlab-runner01](https://csentry.esss.lu.se/network/hosts/view/gitlab-runner-01) for an example that was defined with only the docker executor and the `xilinx` tag.

```{warning}
The current [ics-ans-role-gitlab-runner](https://gitlab.esss.lu.se/ics-ansible-galaxy/ics-ans-role-gitlab-runner) Ansible role only registers a runner if it isn't already registered. Modifying the runner configuration won't trigger any update. The runner has to be [unregistered](https://docs.gitlab.com/runner/commands/#gitlab-runner-unregister) or the configuration modified manually.
```

The gitlab-runner configuration is stored under `/etc/gitlab-runner/config.toml`. See [GitLab Runner commands](https://docs.gitlab.com/runner/commands/README.html) and {ref}`GitLab Runner deployment ` for more information.

## Local Control Room (LCR)

The main Ansible group for the LCR workstations is the [local_control_room] group, where global settings are defined. Each LCR workstation should be part of that group or a subgroup. The workstations are indeed divided into several subgroups:

- [lcr_cryo]: Cryo workstations
- [lcr_dev]: Development workstations used by the SW team (located in the Office)
- [lcr_operations]: Operations workstations
- [lcr_ts2]: TS2 workstations

Those subgroups were created to allow different settings on the workstations, especially the `epics_ca_addr_list` variable, to give access to different networks. They have also been used to deploy different versions of CS-Studio (upgrading some subgroups first). Note that global settings for OpenXAL or Phoebus shall be defined in the application group: [openxal] or [phoebus]. The [local_control_room] group is a child of those groups.

The LCR workstations are deployed using several playbooks:

- [update-centos](https://torn.tn.esss.lu.se/#/templates/job_template/251): only run after OS installation to install NVIDIA drivers
- [deploy-local-control-room](https://torn.tn.esss.lu.se/#/templates/job_template/78): to install all SW on the workstation, except Phoebus
- [deploy-phoebus](https://torn.tn.esss.lu.se/#/templates/job_template/428): to deploy Phoebus

Each template can be run on a specific workstation, the full [local_control_room] group or a subgroup.

## Moving a service from GPN to the DMZ

Moving a service requires creating a new VM and deploying it from scratch using Ansible. If the service uses a database or local data, a backup has to be restored.

```{note}
DNS on the `esss.lu.se` domain is managed by IT. There is no link with CSEntry.
Updating hosts in CSEntry on the [InitialOP-DMZ](https://csentry.esss.lu.se/network/networks/view/InitialOP-DMZ) or [pr-srv-esss-lu-se](https://csentry.esss.lu.se/network/networks/view/pr-srv-esss-lu-se) networks has no impact on DNS.
```

The following describes the steps performed to move the csentry-test server as an example:

- Register the new host in CSEntry: Note that the old host could be moved to the new network, but a random MAC address would have to be generated. It's easier to delete the old interface and create a new one, or to create a new host (if the same hostname is used, the previous entry has to be deleted). In this case `csentry-test` was a cname that was deleted before registering the new host.
- Trigger the new VM creation from CSEntry.
- Deploy the new VM. If a static inventory is used, update it and specify the new IP using the `ansible_host` variable. This is done automatically when using the csentry inventory (the DNS doesn't need to be up-to-date).
- Stop the old service. For CSEntry, stopping traefik is enough: `sudo docker stop traefik_proxy`
- Run a backup to get the latest state. For CSEntry: `sudo /usr/local/sbin/dump-db`
- Copy the result `/dumps/csentry_db-20201126-1426.sql.gz` to the new machine.
- Restore the backup on the new machine. For CSEntry: `sudo /usr/local/sbin/restore-db /tmp/csentry_db-20201126-1426.sql.gz`
- After restoring the backup, some extra actions might have to be performed. For CSEntry, the elasticsearch database needs to be synchronized with the data restored in postgres. Run: `sudo docker exec csentry_web flask reindex`. Or stop the _csentry_web_ container and re-launch the playbook for the handler to be triggered.
- Ask IT to update the DNS.
- The new service is up and running!
- Create a SNOW ticket to delete the old GPN VM.

## Python deployment

There are many ways to deploy Python applications/tools.

### OS package

If the tool is available as an RPM, this is the easiest way to deploy it. Use `yum` in that case.
This solution also works to deploy a simple Python script with dependencies available via the OS. An example is the `slackalarm` Python script. See the [playbook](https://gitlab.esss.lu.se/ics-ansible-galaxy/ics-ans-slackalarm/-/blob/master/playbook.yml).

### Pex / Shiv

For command-line tools that are only available from [pypi] or require recent dependencies, [pex] and [shiv] are good options. They both build fully self-contained Python zipapps with all their dependencies included. [shiv] requires Python >= 3.6. This is the recommended utility for Python 3 compatible tools. If Python 2 is required, use [pex]. `ansible-galaxy` is packaged with [pex] for inclusion in the Development Machine (an RPM wasn't available at the time). See .

Docker images are available to easily create a GitLab CI pipeline based on those utilities. See [awx-shiv](https://gitlab.esss.lu.se/ics-infrastructure/awx-shiv) as an example.

### Docker

Docker is a good alternative to run larger applications. This is the solution used for [CSEntry](https://gitlab.esss.lu.se/ics-infrastructure/csentry) and [galaxy-bot](https://gitlab.esss.lu.se/ics-infrastructure/galaxy-bot). It is also a solution used for tools required by GitLab CI. The [awxkit image](https://gitlab.esss.lu.se/ics-docker/awxkit) is used by many pipelines to trigger AWX jobs.

### Conda environment

To install multiple libraries or tools that require non-Python dependencies (like epics-base), an alternative is to use a conda environment. Conda allows us to choose the Python version and to install extra requirements.
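For illustration, a minimal environment file for such a case might look like the sketch below. The environment name, channel and package list are hypothetical, not an actual ESS environment:

```yaml
# environment.yml -- hypothetical example environment
name: epics-tools
channels:
  - conda-forge        # assumed channel; use the site's own channel if one is required
dependencies:
  - python=3.8         # conda lets us pin the Python version
  - epics-base         # non-Python dependency handled by conda
  - pyepics            # Python package that depends on the above
```

Such an environment can be created manually with `conda env create -f environment.yml`, or managed from a playbook with the dedicated Ansible modules.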
There are two Ansible modules to manage conda environments:

- [conda module](https://gitlab.esss.lu.se/ics-ansible-galaxy/ics-ans-role-conda/-/blob/master/library/conda.py)
- [conda_env module](https://gitlab.esss.lu.se/ics-ansible-galaxy/ics-ans-role-conda/-/blob/master/library/conda_env.py)

This solution is used, for example, to deploy:

- epics-base in the LCR
- EPICS CA and PVA Gateways

[all]: https://csentry.esss.lu.se/network/groups/view/all
[local_control_room]: https://csentry.esss.lu.se/network/groups/view/local_control_room
[lcr_cryo]: https://csentry.esss.lu.se/network/groups/view/lcr_cryo
[lcr_dev]: https://csentry.esss.lu.se/network/groups/view/lcr_dev
[lcr_operations]: https://csentry.esss.lu.se/network/groups/view/lcr_operations
[lcr_ts2]: https://csentry.esss.lu.se/network/groups/view/lcr_ts2
[openxal]: https://csentry.esss.lu.se/network/groups/view/openxal
[phoebus]: https://csentry.esss.lu.se/network/groups/view/phoebus
[recceiver]: https://csentry.esss.lu.se/network/groups/view/recceiver
[pypi]: https://pypi.org
[pex]: https://pex.readthedocs.io/
[shiv]: https://shiv.readthedocs.io/
[ro-epics-gw-tn]: https://csentry.esss.lu.se/network/hosts/view/ro-epics-gw-tn
[rw-epics-gw-tn]: https://csentry.esss.lu.se/network/hosts/view/rw-epics-gw-tn
[fastapi]: https://fastapi.tiangolo.com/
[ess_notify_servers group]: https://csentry.esss.lu.se/network/groups/view/ess_notify_servers