pwd stands for "print working directory".
pwd
Change the current working directory:
cd <directory-name>
Create a new directory:
mkdir <directory-name>
Create the file:
touch <file-name>
List the files and directories in the current directory. The -l flag displays the long format, and the -h flag displays file sizes in human-readable format.
ls -lh
Copy the file to the destination repository:
cp /path/to/source/repo/file /path/to/destination/repo/
Rename a file in a repository:
mv <old-file-name> <new-file-name>
Remove all files and subdirectories in the current directory (hidden files that start with a dot are not matched by *). This cannot be undone:
rm -rf *
cat stands for "concatenate". Display the contents of a file:
cat <file-name>
Find the location of a program:
which <program-name>
Return the type of a program:
type <program-name>
Display a line of text/string that is passed as an argument.
echo <text>
Search the given file for lines that match the given string or pattern:
grep "word-to-search" file.txt
The > operator is used to redirect the output of a command to a file. If the file already exists, it will be overwritten. If the file does not exist, it will be created.
echo "Hello, World!" > myfile.txt
The >> operator is used to redirect the output of a command to a file. If the file already exists, the output will be appended to the end of the file. If the file does not exist, it will be created.
echo "Hello, World!" >> myfile.txt
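A minimal sketch of the difference between overwriting and appending. It uses a scratch file from mktemp so nothing real is touched; the variable names and file contents are illustrative only.

```shell
# Work in a scratch file so no existing file is overwritten.
demo_file=$(mktemp)

echo "first"  > "$demo_file"    # creates or overwrites: the file holds one line
echo "second" > "$demo_file"    # overwrites again: still one line
echo "third" >> "$demo_file"    # appends: the file now holds two lines

cat "$demo_file"
# tr strips the padding that some wc implementations add.
redirect_lines=$(wc -l < "$demo_file" | tr -d ' ')
echo "line count: $redirect_lines"
```

Only the last overwrite and the append survive, which is why > is best reserved for files that are safe to replace.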
The | operator is called a pipe. It sends the output of one command to another command: the output of the first command is used as the input of the second command.
echo "Hello, World!" | wc
Another way to search text is to pipe the output of cat into grep:
cat file.txt | grep "word-to-search"
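As a rough illustration, the two ways of searching give the same result, and pipes can be chained further. The sample file and its contents are made up for the demo.

```shell
# Build a small sample file (contents are invented for this example).
sample=$(mktemp)
printf 'apple pie\nbanana split\napple tart\n' > "$sample"

# grep can read the file directly...
direct=$(grep "apple" "$sample")
# ...or read the same text from a pipe.
piped=$(cat "$sample" | grep "apple")

# Pipes can be chained: count the matching lines.
match_count=$(grep "apple" "$sample" | wc -l | tr -d ' ')
echo "$match_count matching lines"
```

Reading the file directly avoids starting an extra cat process; the piped form is mostly useful when the text comes from another command rather than a file.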
The && operator is used to execute a second command after the first command has finished executing. The second command will only be executed if the first command was successful.
echo "Hello, World!" && echo "Hello, World!"
The ; operator is used to execute a second command after the first command has finished executing. The second command will be executed regardless of whether the first command was successful.
echo "Hello, World!"; echo "Hello, World!"
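Because echo always succeeds, the two examples above behave the same. A small sketch that makes the difference visible uses false, a command that always fails; the variable names are illustrative.

```shell
and_log="second command skipped"
seq_log="second command skipped"

# `false` always fails with exit status 1.
false && and_log="second command ran"   # && : skipped because false failed
false ;  seq_log="second command ran"   # ;  : runs regardless of the failure

echo "with &&: $and_log"
echo "with ; : $seq_log"
```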
The & operator is used to run a command in the background. The shell prompt returns immediately while the command keeps running.
sleep 10 &
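A short sketch of how a background job can be tracked afterwards, assuming a POSIX shell: $! holds the PID of the most recent background job and wait blocks until it finishes. The one-second sleep stands in for any long-running command.

```shell
# Start a short background job; the script continues immediately.
sleep 1 &
bg_pid=$!                 # PID of the most recent background job
echo "started background job with PID $bg_pid"

# wait blocks until that job finishes and returns its exit status.
wait "$bg_pid"
bg_status=$?
echo "background job exited with status $bg_status"
```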
The * operator is called a wildcard. It matches zero or more characters, so it can be used to match any file or directory name.
ls *.txt
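A minimal sketch of wildcard expansion in a scratch directory; the file names are invented for the demo. The shell expands the pattern before ls runs, so ls only receives the matching names.

```shell
glob_dir=$(mktemp -d)
touch "$glob_dir/notes.txt" "$glob_dir/todo.txt" "$glob_dir/image.png"

# The shell expands *.txt first, so ls only sees the two .txt names.
txt_files=$(cd "$glob_dir" && ls *.txt)
echo "$txt_files"
```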
List all the files installed by a package. dpkg is the package management program that installs, removes, and provides information about .deb packages. grep searches through text and prints lines that match a pattern; here it keeps only the paths that contain bin/.
dpkg -L <program-name> | grep bin/
netstat is a command-line utility that displays network connections (both incoming and outgoing), routing tables, and network interface and protocol statistics. The t flag displays TCP connections, the u flag displays UDP connections, the l flag displays only listening sockets, and the n flag displays numerical addresses instead of trying to determine symbolic host, port or user names. grep <port-number> filters the displayed connections by a specific port number.
netstat -tuln | grep <port-number>
lsof stands for "list open files". It displays information about files opened by processes. The -i flag selects only internet (network) connections, the -P flag shows port numbers instead of converting them to service names, and the -n flag shows numerical addresses instead of resolving host names.
lsof -i -P -n | grep <port-number>
or query a specific port directly (9696 in this example):
sudo lsof -i :9696
chmod changes the file mode bits, which define the permissions of a file or directory. The +x flag adds executable permission to a file.
chmod +x <file-name>
After this command is executed, we can run the script directly from the command line like this: ./<file-name>
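A small sketch of the full cycle: write a script, make it executable, and run it by path. The script name hello.sh and its contents are hypothetical, and the example assumes the temporary directory allows execution.

```shell
script_dir=$(mktemp -d)
# Hypothetical script, used only for this demo.
printf '#!/bin/sh\necho "hello from script"\n' > "$script_dir/hello.sh"

chmod +x "$script_dir/hello.sh"          # add the execute bit
script_output=$("$script_dir/hello.sh")  # the script can now be run by path
echo "$script_output"
```

Without the chmod step, running the file by path fails with a "Permission denied" error.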
Pipe the output of <command> into awk, which then prints the first field of each line. This is useful for extracting specific columns from a command's output, such as the process ID of a running program. NR is a built-in awk variable that contains the current line number, so NR>1 skips the first line of the output. Change the 1 in NR>1 to n to skip the first n lines, or change $1 to $n to print the n-th column.
<command> | awk 'NR>1{print $1}'
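A minimal sketch of the same awk pattern on a fake two-column table with a header row. The table stands in for real command output; the PIDs and names are made up.

```shell
# Fake "command output" with a header row (values are invented).
table=$(printf 'PID NAME\n101 python\n202 bash\n')

# NR>1 skips the header; $1 selects the first column.
pids=$(printf '%s\n' "$table" | awk 'NR>1{print $1}')
# Changing $1 to $2 prints the second column instead.
names=$(printf '%s\n' "$table" | awk 'NR>1{print $2}')

echo "$pids"
echo "$names"
```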
Converts input from standard input into arguments to a command. If we have a column from awk, then this would be transformed into a list of arguments that would be passed to the command.
<command> | xargs <command>
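A short sketch of xargs turning lines of input into arguments, run in a scratch directory with invented file names. Note that plain xargs splits on whitespace, so this form assumes names without spaces.

```shell
xargs_dir=$(mktemp -d)
touch "$xargs_dir/a.log" "$xargs_dir/b.log" "$xargs_dir/keep.txt"

# Each line on stdin becomes an argument, so this runs: rm -f a.log b.log
(cd "$xargs_dir" && printf 'a.log\nb.log\n' | xargs rm -f)

remaining=$(ls "$xargs_dir")
echo "$remaining"
```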
Find and display information about processes related to a program. ps displays information about active processes: a shows processes for all users, u displays each process's user/owner, and x also shows processes not attached to a terminal.
ps aux | grep <program-name>
Count the number of lines in a file:
wc -l <file-name>
Display the first 10 lines of a file:
head -n 10 <file-name>
The PATH environment variable contains multiple paths, each separated by a colon (:). The order of the paths matters. When you type a command, the shell looks through these paths in the order they're listed. Once it finds a matching executable, it stops searching. If a program isn't running as expected, it might be because a different version earlier in your PATH is being executed instead.
Display the current value of the PATH environment variable:
echo $PATH
Add a new directory to the existing PATH environment variable:
export PATH=$PATH:/path/to/directory
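A rough sketch of how search order plays out. It prepends a scratch directory so that it is searched first, whereas the append form above places the directory last. The command name my-demo-tool is hypothetical and chosen to be unlikely to exist elsewhere.

```shell
path_dir=$(mktemp -d)
# Hypothetical command, created only for this demo.
printf '#!/bin/sh\necho "custom tool"\n' > "$path_dir/my-demo-tool"
chmod +x "$path_dir/my-demo-tool"

# Prepending puts the directory first in the search order;
# appending (PATH=$PATH:dir) would put it last.
export PATH="$path_dir:$PATH"

# command -v reports which executable the shell would run.
resolved=$(command -v my-demo-tool)
echo "$resolved"
```

The change only lasts for the current shell session; making it permanent usually means adding the export line to a shell startup file.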
Files in the ~/.ssh directory:
- config: The SSH client configuration file. It contains settings for SSH, such as hostname aliases, specific identity files (private keys) to use for different hosts, user names, port numbers, and other preferences for SSH connections. This file is read by the SSH client to determine how to connect to a particular server and with what credentials.
- .pub files: These are public SSH keys corresponding to their private counterparts (without the .pub extension). They are used in public-key cryptography to securely verify our identity.
Create a new SSH key with the ed25519 algorithm, using the provided email as a label. The -t flag specifies the type of key to create, and the -C flag sets the comment (label).
ssh-keygen -t ed25519 -C "<your-email>"
Start the SSH agent using eval:
eval $(ssh-agent -s)
or run ssh-agent directly and then export the variables it prints (shown here as OUTPUT):
ssh-agent
OUTPUT
export OUTPUT
Add the SSH private key to the SSH agent. This ensures the SSH agent is aware of the private key and will manage it, making it easier to establish connections without entering a passphrase every time.
ssh-add ~/.ssh/id_ed25519
Remove the SSH private key from the SSH agent:
ssh-add -d key_name
List the keys the SSH agent is currently holding:
ssh-add -l
Access the cluster over SSH. The -Y flag enables trusted X11 forwarding, and -p specifies the port:
ssh -Y <user>@<host> -p <port>
Change password:
passwd
A "node" refers to a single computer or machine within the cluster. Each node can be a separate physical server, or it can be a virtual machine. To switch to another node in the cluster (cn025 is an example node name), use:
ssh cn025
List all the nodes in the cluster:
sinfo -l
Displays information about the CPU architecture:
lscpu
Show the amount of free and used memory in the system:
free -h
Information about the disk space usage on the cluster:
df -h
Give detailed information about NVIDIA GPUs (model, usage, memory):
nvidia-smi
Displays basic information about the system's kernel, operating system, and hardware platform:
uname -a
Show the current usage of CPU, memory, and other resources in real time:
top
or, with a more user-friendly interface:
htop
List all the git configurations:
git config --list
git config --global --list
git config --local --list
Set the default branch name to main as in GitHub:
git config --global init.defaultBranch main
Initialize a git repository:
git init
Establish a new remote repository that our local repository can interact with:
git remote add origin <SSH_URL or HTTPS>
Difference between the working directory and the staging area:
git diff
List new and modified files:
git status
git branch is used for creating, listing, and deleting branches, while git checkout is used for switching between branches and also for creating a new branch if used with the -b flag.
List all the branches in the repository with -a flag:
git branch -a
Delete a branch with the -d flag (-D to force deletion):
git branch -d <branch-name>
Create and immediately switch to a new branch:
git checkout -b <new-branch-name>
or
git switch -c <new-branch-name>
Switch to an existing branch:
git checkout <branch-name>
Rename a branch:
git branch -m <old-branch-name> <new-branch-name>
Add a file to the staging area:
git add <file-name>
Remove a file from the staging area before commit:
git reset <file-name>
Commit the staged changes to the current branch (HEAD):
git commit -m "Commit message"
Stop tracking a file that was previously committed to the repository. It's often used for files that should no longer be part of the repository (e.g., accidentally committed files, files that should be ignored).
git rm --cached <file-name>
Or remove all files from the index (stop tracking them) while keeping them on disk. The -r flag is for recursive removal, and . indicates the current directory.
git rm --cached -r .
Change the commit message or add/remove files in the last commit (only if it has not been pushed, and after staging/unstaging the files with git add / git rm):
git commit --amend -m "New commit message"
Resets the current branch to a specific commit (or to the latest commit of a remote branch like origin/main), discarding all subsequent commits and local changes. This is commonly used to align the branch precisely with a remote branch, erasing any local divergences.
git reset --hard <commit-hash>
Create a new commit that undoes the changes made by the specified commit. This effectively excludes that commit's changes from the branch without affecting the subsequent commits:
git revert <commit-hash>
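A minimal sketch of revert in a throwaway repository, assuming git is installed. The file name, commit messages, and demo identity are invented, and the identity is set for this repository only so the commits succeed on any machine.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
# Demo identity, set per repository so the commits work anywhere.
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "good line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "bad line" >> notes.txt
git commit -q -am "Add a mistake"

# Revert the latest commit; --no-edit accepts the default message.
git revert --no-edit HEAD > /dev/null

git log --oneline        # the revert appears as a new commit on top
cat notes.txt            # the file content after the revert
```

History is preserved: the mistaken commit stays in the log, which is why revert is generally the safer choice on branches that have already been pushed.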
Show the commit history for the currently active branch:
git log
Push the branch to remote repository:
git push origin <branch-name>
Pull changes from the remote repository to the local repository:
git pull origin <branch-name>
Apply local commits on top of the remote branch's commits:
git pull --rebase origin main
Fetch the changes from the remote repository to the local repository:
git fetch origin <branch-name>
Merge the specified branch into the current branch:
git merge <branch-name>
Rebase the current HEAD onto the specified branch:
git rebase <branch-name>
Create a new commit that undoes all of the changes made in <commit-hash>, then apply it to the current branch:
git revert <commit-hash>
Replace the current working directory and staging area with the state of the tree at the given commit:
git reset --hard <commit-hash>
To reset only the staging area without affecting the working directory, omit --hard. This unstages any changes since the specified commit but leaves the files in the working directory unchanged:
git reset <commit-hash>
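A small sketch contrasting the two reset modes in a throwaway repository, assuming git is installed. The file name, contents, and demo identity are invented; HEAD is used as the target commit to keep the example short.

```shell
reset_repo=$(mktemp -d)
cd "$reset_repo"
git init -q
# Demo identity, set per repository so the commit works anywhere.
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > file.txt
git add file.txt
git commit -q -m "First version"

echo "v2" > file.txt
git add file.txt                 # v2 is now staged

git reset -q HEAD                # default (mixed) reset: unstage, keep v2 on disk
after_mixed=$(cat file.txt)
staged_after_mixed=$(git diff --cached --name-only)

git reset -q --hard HEAD         # hard reset: discard v2 from disk too
after_hard=$(cat file.txt)

echo "after mixed reset: $after_mixed"
echo "after hard reset:  $after_hard"
```

The hard reset is the destructive one: uncommitted edits in tracked files are lost and cannot be recovered through git.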
Switch a repository's remote URL to use SSH in GitHub:
git remote set-url origin <SSH_URL>
List all the remote repositories:
git remote -v
Initialize Git LFS in the repository:
git lfs install
Track a file with Git LFS. This adds an entry to the .gitattributes file:
git lfs track <file-name>
Untrack a file with Git LFS:
git lfs untrack <file-name>
List all the files tracked by Git LFS:
git lfs ls-files
Install Pipenv to manage project dependencies.
pip install pipenv
Install a specific package and add it to the Pipfile. To install all packages from the Pipfile, run pipenv install without arguments.
pipenv install <package-name>
Uninstall a specific package and remove it from Pipfile.
pipenv uninstall <package-name>
Activate the virtual environment associated with your project:
pipenv shell
Show the location of the virtual environment for the project:
pipenv --venv
Remove the virtual environment for the project.
pipenv --rm
Update all packages to their latest versions as specified in Pipfile.
pipenv update
Update a specific package to its latest version as specified in Pipfile:
pipenv update <package-name>
Clearing the cache is useful to free up space or ensure that pipenv is using the most recent versions of packages without being influenced by cached data.
pipenv --clear
Check for security vulnerabilities in the installed packages:
pipenv check
Show a graph of your installed dependencies, which can be helpful to see what's installed and how those packages are related:
pipenv graph
List all packages installed in the virtual environment managed by pipenv:
pipenv run pip list
Exit the virtual environment:
exit
pyenv is a tool for managing multiple versions of Python on the same machine. To install a Python version:
pyenv install <python-version>
Update the shim files for all Python executables known to pyenv (i.e., ~/.pyenv/versions/*/bin/*). Run this command after installing a new version of Python.
pyenv rehash
A list of all available Python versions can be obtained with:
pyenv install -l
List all the python versions installed by pyenv:
pyenv versions
Sets the specified Python version as the default for your entire user account. Any new shell session will use this version unless overridden by a local setting.
pyenv global <python-version>
Sets the specified Python version for the current directory (project). This version overrides the global setting when you're working in this directory.
pyenv local <python-version>
Uninstall a Python version:
pyenv uninstall <python-version>
Build a Docker image. The -t flag stands for "tag" and gives the image a name. The dot (.) means the current directory and tells Docker to use the files in that directory as the build context.
docker build -t <image-name> .
Download docker image:
sudo docker pull <docker-image>
List of docker images:
sudo docker images
Run a Docker image. The -it flags allow interaction with a command-line interface inside the container. The -p flag publishes a container port to the host (host-port:container-port).
sudo docker run -it -p 9696:9696 <docker-image>:<tag>
List all containers (running and stopped). ps is named after the Unix command that lists running processes, and -a stands for "all".
sudo docker ps -a
Inspect a docker object:
sudo docker inspect <image_or_container_id>
Stop a Docker container:
sudo docker stop <container_id>
Remove a docker container forcefully:
sudo docker rm -f <container_id>
Remove a docker image forcefully:
sudo docker rmi -f <image_id>
Remove all stopped containers, all unused networks, all images not used by any container, and the build cache. Volumes are only removed when --volumes is added.
docker system prune -a
Login to docker hub:
sudo docker login
Change tag name to match the tag from Docker Hub:
docker tag <local-image-name>:<local-tag> <docker-hub-username>/<repository-name>:<desired-tag>
Push the Docker image to Docker Hub. The local image must first be tagged with the exact name of the repository on Docker Hub:
docker push <docker-hub-username>/<repository-name>:<tag>
Remove all Docker containers:
sudo docker ps -a | awk 'NR>1{print $1}' | xargs sudo docker rm -f
Create a cluster with kind:
kind create cluster
Check with kubectl that it was successfully created:
kubectl cluster-info
Load a Docker image into the cluster:
kind load docker-image <image-name>:<tag>
Apply configuration to our cluster from a YAML file. The -f flag specifies the filename.
kubectl apply -f <filename>.yaml
Delete the Kubernetes resources defined in a given YAML file from our cluster:
sudo kubectl delete -f <filename>.yaml
List all deployments:
kubectl get deployments
Delete a deployment:
kubectl delete deployment <deployment-name>
List all pods:
kubectl get pods
List all services:
kubectl get services
Delete a service:
kubectl delete service <service-name>
Create the HPA (Horizontal Pod Autoscaler) for the deployment:
kubectl autoscale deployment <deployment-name> --name <hpa-name> --cpu-percent=20 --min=1 --max=3
List all HPA:
kubectl get hpa
Show the details of a HPA:
kubectl describe hpa <hpa-name>
Delete HPA:
kubectl delete hpa <hpa-name>
Show the logs of the Metrics Server. Used to check the operational logs of the Metrics Server to diagnose issues, monitor its activities, or understand its interactions with other components in the cluster.
kubectl logs -n kube-system -l k8s-app=metrics-server
Delete the Kubernetes cluster managed by kind, including all its control plane and worker nodes.
kind delete cluster --name your-cluster-name
Create a new project from the Cookiecutter Data Science template:
cookiecutter https://github.com/drivendata/cookiecutter-data-science
Convert notebook to markdown:
jupyter nbconvert --to markdown <notebook-name>.ipynb
Create a new Jekyll site:
jekyll new <site-name>
Run jekyll server locally:
bundle exec jekyll serve
Convert a Jupyter notebook to LaTeX:
jupyter nbconvert --to latex <notebook-name>.ipynb
Convert the LaTeX file to PDF:
xelatex <notebook-name>.tex