CyberSight is our final project at Bar-Ilan University, developed as part of our academic degree in the Department of Information Sciences. After submitting the project, I continued to work on it independently.
Deepfakes, realistic synthetic media generated using machine learning algorithms, have become a significant concern in recent years. They can be employed to spread disinformation or for malicious purposes, and it's often difficult to discern what's real and what's not. In response, we have developed an application capable of identifying whether a picture of a human face is real or fake.
- Technologies Used
- General Folder Structure
- Prerequisites
- Setup
- Running the Project
- Databases UI Access
- About the Identification Technique
- Project Members
- Frontend: Angular
- Backend: NestJS and Flask
- Databases: PostgreSQL and MongoDB (using Prisma ORM)
- Monorepo Management: Turborepo
In addition, we used Dev Containers for the development process. This approach enables us to run the entire project in a Docker container, including all the necessary dependencies and configurations.
Notable JS Libraries:
- typescript
- eslint
- prisma
- joi
- sharp
- passport
- argon2
├── .devcontainer // Dev Containers configuration files and environment variables for development
├── apps
│   ├── client // Angular
│   └── server // NestJS
├── packages
│   ├── database // Prisma schemas for both PostgreSQL and MongoDB
│   ├── eslint-config-custom // Shared ESLint configuration
│   └── shared // Validation rules and types
└── python_apps
    ├── image_detect_flask // Flask server (for image detection)
    └── train_and_classification // Python scripts for training and classification
- Docker Desktop
- Visual Studio Code
- Dev Containers extension for VS Code
- MongoDB server
- Download Docker Desktop: Visit the Docker Desktop download page and choose the package suitable for your system.
- Launch Docker Desktop: After installation, launch the Docker Desktop application.
- Install the Dev Containers Extension: In VS Code, open the Extensions Marketplace and install the 'Dev Containers' extension.
- Clone the Repository: Clone this repository locally and open it in VS Code.
- Insert Your MongoDB Connection String: In the `.devcontainer/dev.env` file, replace the `MONGODB_DATABASE_URL` value with your MongoDB connection string.
- Build the Dev Container: Press `F1` or `Ctrl+Shift+P` to open the VS Code command palette, then type and select 'Dev Containers: Build and Open in Container' (or 'Dev Containers: Rebuild and Reopen in Container'). The container build process will start; this may take several minutes on the first run.
- Run the Project: After the build completes, wait for the ZSH terminal to display 'Done. Press any key to close the terminal'. Then press `F1` or `Ctrl+Shift+P` again to open the command palette, select 'Tasks: Run Task', and choose 'Run Server, Client, and Flask'.
  - Alternatively, you can run the following commands: `pnpm -F server dev`, `pnpm -F client dev`, and `cd python_apps/image_detect_flask/ && flask --app main run`.
- Open the Web Application: When the tasks have finished loading, open your web browser and go to https://localhost:4200 to access the client-side homepage.
To access the databases' UI (Prisma Studio), go back to 'Run Task' in VS Code and choose 'Run Prisma Studio (Postgres)', then run another task and choose 'Run Prisma Studio (Mongo)'.
- Alternatively, you can run the following commands: `pnpm -F database postgres:studio` and `pnpm -F database mongo:studio`.
We utilized supervised models to distinguish between real and fake facial images, employing high-frequency component analysis. This technique is based on the following article:
Durall, R., Keuper, M., Pfreundt, F. J., & Keuper, J. (2019). Unmasking deepfakes with simple features.
arXiv preprint arXiv:1911.00686.
Additionally, we referred to the code shared by the article's authors on GitHub.
Identification Process:
- The user uploads an image.
- The image is transmitted to the Flask server.
- The face is detected and cropped (if no face is found, an appropriate message is sent back to the user).
- The image is then resized to 256x256 and transformed into a one-dimensional frequency spectrum.
- The models execute the prediction.
- The prediction result is sent to the user.
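The spectrum step above can be sketched as follows: the 2D FFT magnitude of the 256x256 face image is azimuthally averaged into a fixed-length 1D feature vector, following the approach of Durall et al. (2019). This is an illustrative sketch; the function name, bin count, and log scaling are assumptions, not the project's actual code.

```python
import numpy as np

def radial_spectrum(image: np.ndarray, n_bins: int = 128) -> np.ndarray:
    """Reduce a 2D grayscale image to a 1D spectrum by azimuthally
    averaging its log-magnitude FFT (illustrative implementation)."""
    # 2D FFT, shifted so the zero frequency sits at the center
    f = np.fft.fftshift(np.fft.fft2(image))
    magnitude = 20 * np.log(np.abs(f) + 1e-8)

    # Distance of every pixel from the spectrum center
    h, w = magnitude.shape
    cy, cx = h // 2, w // 2
    y, x = np.indices((h, w))
    r = np.sqrt((y - cy) ** 2 + (x - cx) ** 2)

    # Average the magnitude within equal-width radial bins
    bins = np.linspace(0, r.max(), n_bins + 1)
    idx = np.clip(np.digitize(r.ravel(), bins) - 1, 0, n_bins - 1)
    totals = np.bincount(idx, weights=magnitude.ravel(), minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    return totals / np.maximum(counts, 1)

# A 256x256 image yields a fixed-length feature vector for the classifiers
features = radial_spectrum(np.random.rand(256, 256))
```

Averaging over radii discards orientation, so each image becomes a compact vector of frequency energies — exactly the kind of input a classical classifier such as an SVM can consume.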
Training Data and Models:
The supervised models were trained on 38,400 images of human faces, drawn from the following datasets:
Fake:
- 100KFake: 6,400
- ThisPersonDoesNotExist: 12,800
Real:
- CelebA: 6,400
- CelebA-HQ: 6,400
- Flickr-Faces-HQ: 6,400
To enhance accuracy, three models were employed, and the final prediction result is determined by the majority prediction across these models. The models used are:
- Support Vector Machine (SVM) – scikit-learn
- Logistic Regression – scikit-learn
- Neural Network (NN) – TensorFlow
Each model provides a binary answer: 0 for a real face and 1 for a fake face.
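The ensemble step can be sketched as follows; `majority_vote` is an illustrative name, not the project's actual function, but the logic matches the description above: with three binary predictions, the label returned by at least two models wins.

```python
def majority_vote(predictions: list[int]) -> int:
    """Combine binary predictions (0 = real, 1 = fake) from an odd
    number of models; the label predicted by most models wins."""
    assert len(predictions) % 2 == 1, "use an odd count to avoid ties"
    return int(sum(predictions) > len(predictions) / 2)

# e.g. the SVM and NN say fake while logistic regression says real
majority_vote([1, 0, 1])  # → 1 (fake)
```

Using an odd number of models (three here) guarantees there is never a tie, which is one reason simple majority voting works well for binary ensembles.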
👤 Saar Rozenthal
👤 Yuval Tzoor
👤 Shaked Partush
👤 Yuval Abramovich