- Enrico Bocchi
- Diogo Castro
- João Vicente
- Jakub Moscicki
- Enric Tejedor
SWAN (Service for Web-based ANalysis) is a cloud data analysis service developed and powered by CERN that provides Jupyter notebooks on demand. It is based on upstream Jupyter technology but is deeply integrated with CERN-specific services, e.g. EOS, which provides storage to SWAN, and CVMFS, which is used to retrieve software on the fly. Jupyter notebooks, despite being easily accessible through an intuitive web-based interface, form a complex environment, especially when used together with JupyterHub, custom extensions, external storage backends and computational clusters. This project aims to create a testing framework for both upstream Jupyter components and SWAN-specific components. The framework allows new tests to be added as the SWAN service gains features, and it can run synthetic tests. It is self-contained and includes functional as well as performance tests.
This testing framework covers the following components:
Upstream Components :
- configurable-http-proxy
- JupyterHub API
- SQLite database managed by JupyterHub
- SWAN docker containers
CERN-specific components :
This project assumes a SWAN or ScienceBox setup. ScienceBox contains SWAN and all the other CERN-specific services (i.e., EOS, CERNBox, CVMFS).
To install the required dependencies:
git clone https://github.com/Divya063/TestingFramework.git
cd TestingFramework
pip3 install -r requirements.txt
There are two testing modes:
- From the host machine (default)
- From containers (ScienceBox): to run the tests from a user container, use the -u flag and pass the session name as --session {name}, e.g. --session user2
To run the tests from the host machine, use the command:
python3 run.py --configfile [path of yaml file]
This command assumes that all the parameters required to run the tests are provided in the test.yaml file and that, where needed (e.g. for the JupyterHub API tests), the necessary configuration has already been done.
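For orientation, the command-line options used in this README could be parsed roughly as follows. This is a minimal sketch, not the actual run.py: the function name build_parser, the attribute name user_mode and the help texts are assumptions; only the flags -u, --session, --test and --configfile are taken from the commands shown here.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the run.py flags shown in this README."""
    parser = argparse.ArgumentParser(description="Run SWAN tests (illustrative sketch)")
    parser.add_argument("--configfile", required=True,
                        help="path of the YAML file holding the test parameters")
    parser.add_argument("-u", dest="user_mode", action="store_true",
                        help="run the tests from a user container instead of the host")
    parser.add_argument("--session", help="session (user) name, e.g. user2")
    parser.add_argument("--test", help="test group to run, e.g. storage")
    return parser


# Container mode, as in: python3 run.py -u --session user2 --test storage --configfile test.yaml
args = build_parser().parse_args(
    ["-u", "--session", "user2", "--test", "storage", "--configfile", "test.yaml"]
)
print(args.user_mode, args.session, args.test, args.configfile)
```

Without -u the parsed user_mode stays False, which corresponds to the default host mode.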
- To run the tests from a user container, use the command below (there is no need to provide a path argument, as the paths are already set in the test.yaml file):
python3 run.py -u --session user2 --test storage --configfile test.yaml
Parameters
mount_sanity:
  timeout: 8
  mountpoints: ['user/u/user2/']
mount:
  user_mode: 'sciencebox'
write:
  fileSize: 1M
  filepath: "eos/user/u/user2/0.txt"
delete:
  filepath: "eos/user/u/user2/0.txt"
exists:
  filepath: "eos/user/u/user2/0.txt"
throughput:
  fileNumber: 10
  fileSize: 1M
  filepath: "eos/user/u/user2/"
checksum:
  fileNumber: 10
  fileSize: 1M
  filepath: "eos/user/u/user2/"
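The mount_sanity parameters above (a timeout and a list of mount points) suggest a check along the following lines. This is a sketch under assumptions, not the code of test_mount_sanity.py: the function names are hypothetical, and probing each path from a child process is one possible way to avoid blocking the test itself on a hanging mount.

```python
import subprocess
import sys
import tempfile


def mount_point_responds(path, timeout):
    """Return True if `path` can be stat-ed within `timeout` seconds.

    The stat call runs in a child Python process, so a hanging mount
    blocks the child rather than the caller.
    """
    probe = [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path]
    try:
        result = subprocess.run(probe, timeout=timeout,
                                stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        return False  # the mount point did not answer in time
    return result.returncode == 0  # non-zero: path missing or not accessible


def check_mount_points(mount_points, timeout):
    """Map every mount point to True (responsive) or False (hanging or missing)."""
    return {path: mount_point_responds(path, timeout) for path in mount_points}


# Usage: a temporary directory stands in for a real mount point such as 'user/u/user2/'.
demo_dir = tempfile.mkdtemp()
print(check_mount_points([demo_dir], timeout=8))
```

One caveat of this design: a child stuck in uninterruptible I/O may not terminate promptly even after it has been sent a kill signal, so a production check may need additional safeguards.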
- mount_sanity:
  - Test file: test_mount_sanity.py
  - Use case: Given a list of mount points, checks whether any mount point is hanging.
  - To run this test explicitly, use:
    python3 test_mount_sanity.py --timeout 5 --mount_points user user/u
- mount:
  - Test file: test_mount.py
  - Use case: Checks whether EOS is mounted on the host and in ScienceBox.
  - There are two testing modes:
    - "host"
    - "sciencebox"
  - To run this test explicitly, use:
    python3 test_mount.py --mode host
- throughput:
  - Test file: test_throughput.py
  - Use case: Benchmarks read-write performance and computes the read and write throughput.
  - To run this test explicitly, use:
    python3 test_throughput.py --num 10 --file-size 1M --dest eos/user/u/user2
- checksum:
  - Test file: test_checksum.py
  - Use case: Calculates the checksum of the test files.
  - To run this test explicitly, use:
    python3 test_checksum.py --num 10 --file-size 1M --dest eos/user/u/user2
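The throughput and checksum tests above can be pictured with the following sketch. It is not the code of test_throughput.py or test_checksum.py: the function names, the file naming scheme and the use of SHA-256 are assumptions made for illustration. It writes a number of files, reads them back, reports MB/s for both phases and compares checksums of what was written and what was read.

```python
import hashlib
import os
import tempfile
import time


def sha256_of(data):
    """Hex digest used to compare written and read content."""
    return hashlib.sha256(data).hexdigest()


def benchmark(dest, num, size):
    """Write then read `num` files of `size` bytes under `dest`.

    Returns (write MB/s, read MB/s, checksums_match).
    """
    payloads = [os.urandom(size) for _ in range(num)]  # prepared before timing starts
    paths = [os.path.join(dest, "%d.txt" % i) for i in range(num)]
    megabytes = num * size / (1024 * 1024)

    start = time.perf_counter()
    for path, payload in zip(paths, payloads):
        with open(path, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to push the data to storage
    write_seconds = max(time.perf_counter() - start, 1e-9)

    start = time.perf_counter()
    read_back = []
    for path in paths:
        with open(path, "rb") as handle:
            read_back.append(handle.read())
    read_seconds = max(time.perf_counter() - start, 1e-9)

    # Checksums are computed outside the timed sections.
    match = [sha256_of(d) for d in payloads] == [sha256_of(d) for d in read_back]
    return megabytes / write_seconds, megabytes / read_seconds, match


# Usage: a temporary directory stands in for eos/user/u/user2/.
with tempfile.TemporaryDirectory() as dest:
    write_mbps, read_mbps, match = benchmark(dest, num=10, size=1024 * 1024)
    print("write %.1f MB/s, read %.1f MB/s, checksums match: %s"
          % (write_mbps, read_mbps, match))
```

Note that reads issued right after the writes may be served from the local page cache, so the read figure of such a sketch can overestimate what the storage backend delivers.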
- To run the test from user container use the command given below.
python3 run.py -u --session user2 --test CVMFS --configfile test.yaml
Parameters
cvmfs:
  mount:
    repoName: 'sft.cern.ch'
    repoPath: 'cvmfs/sft.cern.ch/'
  ttfb:
    repoPath: 'cvmfs/sft.cern.ch/'
    filePath: 'cvmfs/sft.cern.ch/lcg/lastUpdate'
  throughput:
    num: 2 # Number of packages you want to read
    repoPath: 'cvmfs/sft.cern.ch/'
    filePath: 'cvmfs/sft.cern.ch/lcg/releases/'
- mount:
  - Test file: test_mount.py
  - Use case: Checks whether the cvmfs folder is mounted.
  - To run this test explicitly, use:
    python3 test_mount.py --repo sft.cern.ch --path cvmfs/sft.cern.ch/
- ttfb:
  - Test file: test_ttfb.py
  - Use case: Evaluates the time needed to get the first byte (TTFB) of a file known to exist (lastUpdate).
  - To run this test explicitly, use:
    python3 test_ttfb.py --repo sft.cern.ch --path cvmfs/sft.cern.ch/lcg/lastUpdate
- throughput:
  - Test file: test_throughput.py
  - Use case: Benchmarks performance when reading from the repository and computes the read throughput.
  - To run this test explicitly, use:
    python3 test_throughput.py --num 2 --repo_path cvmfs/sft.cern.ch --path cvmfs/sft.cern.ch/lcg/releases/
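The TTFB measurement described above can be sketched as follows. This is an illustration, not the code of test_ttfb.py: the function name is hypothetical, and a temporary file stands in for cvmfs/sft.cern.ch/lcg/lastUpdate. On CVMFS the first access to a file may trigger a download, which is what makes the time to the first byte interesting.

```python
import os
import tempfile
import time


def time_to_first_byte(file_path):
    """Return the seconds needed to open `file_path` and read its first byte."""
    start = time.perf_counter()
    # buffering=0 gives an unbuffered file object, so only one byte is requested.
    with open(file_path, "rb", buffering=0) as handle:
        first = handle.read(1)
    elapsed = time.perf_counter() - start
    if not first:
        raise ValueError("%s is empty, there is no first byte to read" % file_path)
    return elapsed


# Usage: a placeholder file stands in for a file known to exist in the repository.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "lastUpdate")
with open(demo_file, "wb") as handle:
    handle.write(b"placeholder content")
print("TTFB: %.6f s" % time_to_first_byte(demo_file))
```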
- To run the tests from the JupyterHub container, use:
python3 run.py -u --session {session-name} --test jupyterhub-api --configfile test.yaml
e.g. python3 run.py -u --session user2 --test jupyterhub-api --configfile test.yaml
- Parameters
The parameters for the tests are in test.yaml, as follows:
jupyterhub_api:
  check_api:
    hostname: 'localhost'
    port: '443'
    token: ""
    base_path: ""
    verify: False
  token:
    hostname: 'localhost'
    port: '443'
    token: ""
    users: ['user2']
    base_path: ""
    verify: False
  create_session:
    hostname: 'localhost'
    port: '443'
    token: 'b39639d589c44a2294b3dd1164607287'
    users: ['user2']
    base_path: ""
    params:
      TLS: False
      LCG-rel: "LCG_95a"
      platform: "x86_64-centos7-gcc7-opt"
      scriptenv: "none"
      ncores: 2
      memory: 8589934592
      spark-cluster: "none"
    delay: 30 # Max
    verify: False
  check_session:
    hostname: 'localhost'
    port: '443'
    token: ""
    users: ['user2']
    base_path: ""
    verify: False
  stop_session:
    hostname: 'localhost'
    port: '443'
    token: ''
    users: ['user2']
    base_path: ""
    verify: False
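To show how the hostname, port, token, base_path and verify parameters above could fit together, here is a minimal sketch using only the Python standard library. It is not the framework's own client: the function names are hypothetical, and the way base_path is joined into the URL is an assumption. The "Authorization: token ..." header matches the curl example in this README.

```python
import ssl
import urllib.request


def build_request(hostname, port, token, base_path="", endpoint=""):
    """Build a GET request for a JupyterHub API endpoint (e.g. 'users/user2')."""
    url = "https://%s:%s%s/hub/api/%s" % (hostname, port, base_path.rstrip("/"), endpoint)
    return urllib.request.Request(url, headers={"Authorization": "token %s" % token})


def ssl_context(verify):
    """Return an SSL context; with verify=False certificates are not checked."""
    context = ssl.create_default_context()
    if not verify:
        context.check_hostname = False  # must be disabled before CERT_NONE is set
        context.verify_mode = ssl.CERT_NONE
    return context


def get_status(request, verify=False, timeout=10):
    """Send the request and return the HTTP status code (needs a reachable hub)."""
    with urllib.request.urlopen(request, context=ssl_context(verify),
                                timeout=timeout) as response:
        return response.status


request = build_request("localhost", "443", "<token_value>", endpoint="users/user2")
print(request.full_url)
```

urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so a caller that wants to log such codes would catch that exception and read its code attribute.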
- check_api
  - Test file: test_check_api.py
  - Use case: Checks the hub's sanity by making a GET request to "https://localhost:443/hub/api/". On successful execution the response code should be 200.
  - To run this test explicitly, use:
    python3 test_check_api.py --token " " --port 443 --base_path ""
- token
  - Test file: test_token.py
  - Use case: Checks token validity by making a GET request to "https://localhost:443/hub/api/users/user{}". On successful execution the response code should be 200.
  - To run this test explicitly, use:
    python3 test_token.py --port 443 --token " " --users user1 --base_path ""
- create_session
  - Test file: test_create_session.py
  - Use case: Checks whether a session can be created successfully.
  - Parameters:
    - params: The data that needs to be passed to create a user container.
    - delay (time delay): When a session is created, the user's server is first requested, which is acknowledged by response code 202. The server is then created and must respond within 30 s; otherwise it is removed. The maximum wait time is 30 s; if the server does not respond within that time, response code 500 is received.
      Example:
      TimeoutError: Server at https://172.18.0.15:8888/user/user0/ didn't respond in 30 seconds
  - To run the test explicitly, use the following command:
    python3 test_create_session.py --port 443 --token " " --users user2 --json '{"LCG-rel": "LCG_95a", "platform": "x86_64-centos7-gcc7-opt", "scriptenv": "none", "ncores": 2, "memory": 8589934592, "spark-cluster": "none"}' --delay 30 --base_path ""
  Before creating servers/sessions, make sure that:
  - You have generated a token by running
    jupyterhub token dummy_admin
    added it to the jupyterhub_config.py file as
    c.JupyterHub.service_tokens = {'dummy_admin' : '<token_value>'}
    and restarted the jupyterhub process with
    supervisorctl restart jupyterhub
    Also add the token to the "token" field in the yaml file.
  - You have created the users, since each server is created for one particular user.
    - Using curl:
      curl -XPOST -v -k https://localhost:443/hub/api/users/user2 -H "Authorization: token {token from yaml file}"
    - Python code:
      def check_create_users(self):
          """Create every user in self.users through the JupyterHub API.

          Inside the container the port is 443; outside the container it is 8443.
          """
          self.log.write("info", "creating users..")
          for user in self.users:
              try:
                  r = self.session.create_users(user)
              except Exception as err:
                  # The request itself failed, so there is no response to log.
                  self.exit |= 1
                  self.log.write("error", str(err))
              else:
                  if r.status_code == 201:
                      self.log.write("info", user + " successfully created")
                  else:
                      self.log.write("error", r.content.decode("utf-8"))
                      self.exit |= 1
          self.log.write("info", "Exit code " + str(self.exit))
          return self.exit
- check_session
  - Test file: test_check_session.py
  - Use case: Checks whether a session is running.
  - To run this test explicitly, use:
    - For a single user:
      python3 test_check_session.py --port 443 --token " " --users user1 --base_path ""
    - For multiple users:
      python3 test_check_session.py --port 443 --token " " --users user0 user1 user2 --base_path ""
- stop_session
  - Test file: test_stop_session.py
  - Use case: Checks whether a session can be stopped.
  - To run this test explicitly, use:
    - For a single user:
      python3 test_stop_session.py --port 443 --token " " --users user1 --base_path ""
    - For multiple users:
      python3 test_stop_session.py --port 443 --token " " --users user0 user1 user2 --base_path ""
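The delay parameter of create_session, described earlier, amounts to waiting for a server with an upper bound on the waiting time. A generic way to express such a bounded wait is sketched below. The helper name wait_until_ready and its arguments are hypothetical and not part of the framework; the sleep and clock functions are injectable so the logic can be tried without a real server.

```python
import time


def wait_until_ready(is_ready, delay=30, interval=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll `is_ready()` until it returns True or `delay` seconds have passed.

    Returns True if the server became ready in time, False otherwise.
    """
    deadline = clock() + delay
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Usage with a stand-in readiness check that succeeds on the third poll.
answers = iter([False, False, True])
print(wait_until_ready(lambda: next(answers), delay=30, interval=0))
```

In a real test, is_ready would wrap a request to the user's server, for instance a check that the session is reported as running.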
- This test checks the consistency of the SQLite database stored in srv/jupyterhub/ as jupyterhub.sqlite.
- To run this test, use:
python3 run.py -u --session user2 --test database --configfile test.yaml
Parameters:
database:
  token:
    path: "jupyterhub.sqlite"
    user: "user2"
    mode: "active" # active mode
    table_name: "api_tokens"
  servers:
    path: "jupyterhub.sqlite"
    user: "user2"
    mode: "active" # active mode
    table_name: "servers"
  spawners:
    path: "jupyterhub.sqlite"
    user: "user2"
    mode: "active" # active mode
    table_name: "spawners"
There are two modes:
- active mode - When the server is active
- delete mode - When the server is removed or deleted
- token
  - Test file: test_token.py
  - Use case: Checks the status of the token table (api_tokens) when a session is created or removed.
  - To run this test explicitly, use:
    python3 test_token.py --path jupyterhub.sqlite -d --user user2 --table api_tokens
- servers
  - Test file: test_servers.py
  - Use case: Checks the status of the "servers" table when a session is created or removed.
  - To run this test explicitly, use:
    python3 test_servers.py --path jupyterhub.sqlite -d --user user2 --table servers
- spawners
  - Test file: test_spawners.py
  - Use case: Checks the status of the "spawners" table when a session is created or removed.
  - To run this test explicitly, use:
    python3 test_spawners.py --path jupyterhub.sqlite -d --user user2 --table spawners
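The idea of the active and delete modes can be illustrated with the standard sqlite3 module. The sketch below is an assumption-laden illustration, not the framework's code: the function names are hypothetical, the rule "rows present in active mode, no rows in delete mode" is a simplification, and the two-table schema created in memory is a reduced stand-in for the real jupyterhub.sqlite layout.

```python
import sqlite3

# Tables in this simplified, hypothetical schema that carry a user_id column.
USER_TABLES = {"api_tokens", "spawners"}


def count_user_rows(connection, table, user):
    """Count the rows of `table` that belong to the user named `user`."""
    if table not in USER_TABLES:  # table names cannot be bound as SQL parameters
        raise ValueError("unexpected table: %s" % table)
    query = ("SELECT COUNT(*) FROM %s AS t JOIN users ON t.user_id = users.id "
             "WHERE users.name = ?" % table)
    return connection.execute(query, (user,)).fetchone()[0]


def is_consistent(connection, table, user, mode):
    """Assumed rule: rows exist in 'active' mode and are gone in 'delete' mode."""
    rows = count_user_rows(connection, table, user)
    if mode == "active":
        return rows > 0
    if mode == "delete":
        return rows == 0
    raise ValueError("unknown mode: %s" % mode)


# Usage on an in-memory database with a reduced, illustrative schema.
connection = sqlite3.connect(":memory:")
connection.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE spawners (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE api_tokens (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users (id, name) VALUES (1, 'user2');
    INSERT INTO spawners (user_id) VALUES (1);
""")
print(is_consistent(connection, "spawners", "user2", "active"))
```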
- To run this test, use the command below (this test is not meant to be run from the containers):
python3 run.py --configfile test.yaml
Parameters:
user_docker:
  docker:
    container_name: 'jupyter-user2'
    timeout: 5
Checks if a container is healthy or not.
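One way to query a container's health from the host is the docker inspect command with a Go template on the health status. The sketch below assumes the docker CLI is installed and that the container defines a HEALTHCHECK; it is not the framework's implementation, and the function name and the injectable run argument are hypothetical.

```python
import subprocess


def container_is_healthy(container_name, timeout=5, run=subprocess.run):
    """Return True if `docker inspect` reports the container as 'healthy'.

    `run` defaults to subprocess.run and can be replaced for dry runs.
    """
    command = ["docker", "inspect", "--format", "{{.State.Health.Status}}",
               container_name]
    try:
        result = run(command, capture_output=True, text=True, timeout=timeout)
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return False  # docker did not answer in time or is not installed
    # A failing inspect (e.g. unknown container) counts as not healthy.
    return result.returncode == 0 and result.stdout.strip() == "healthy"


# Dry run with a stand-in for subprocess.run, so no Docker daemon is needed.
def fake_run(command, **kwargs):
    return subprocess.CompletedProcess(command, 0, stdout="healthy\n")


print(container_is_healthy("jupyter-user2", timeout=5, run=fake_run))
```

Calling container_is_healthy("jupyter-user2") without the run argument would invoke the real docker CLI with the container name and timeout from the parameters above.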
- Pending Work
- Filed issues
- Implement a health-checking mechanism similar to liveness probes and readiness probes
Divya Rani
- Email ID: [email protected]
- Linkedin Profile: https://linkedin.com/in/divya-rani-23924481/