
✨ Config option to keep the websurfx server connection alive for a certain period for subsequent requests #558

Closed
Tracked by #523
neon-mmd opened this issue Apr 26, 2024 · 3 comments · Fixed by #568

Comments

@neon-mmd
Owner

neon-mmd commented Apr 26, 2024

Work Expected From The Issue

Provide a new config option to allow users to configure the amount of time for which the server should keep the connection with the client alive.

The issue expects the following files to be changed/modified:

  • src/lib.rs
  • src/config/parser.rs
  • websurfx/config.lua

Note

All the files that are expected to be changed are located under the codebase (websurfx directory).

Reason Behind These Changes

The reason behind having these changes is to keep the server-client connection alive for a certain period of time so that any subsequent requests sent to the server by the client during that period do not require re-establishing the connection, which takes time. This reduces the latency between the client request and the server response.
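As a rough illustration of the benefit described above (this snippet is not part of the requested change; it assumes a locally running instance on 127.0.0.1:8080 and uses the reqwest and tokio crates purely for demonstration):

// Two requests sent through the same client within the keep-alive window can
// share one TCP connection, skipping the second handshake entirely.
#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let client = reqwest::Client::new();
    // The first request establishes the TCP connection to the server.
    let _home = client.get("http://127.0.0.1:8080/").send().await?;
    // The second request can reuse that connection, provided it arrives before
    // the server's keep-alive timeout closes the idle socket.
    let _about = client.get("http://127.0.0.1:8080/about").send().await?;
    Ok(())
}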

Sample Code

Sample code for each of the files mentioned above is provided below:

lib.rs

//! This main library module provides the functionality to provide and handle the Tcp server
//! and register all the routes for the `websurfx` meta search engine website.

#![forbid(unsafe_code, clippy::panic)]
#![deny(missing_docs, clippy::missing_docs_in_private_items, clippy::perf)]
#![warn(clippy::cognitive_complexity, rust_2018_idioms)]

pub mod cache;
pub mod config;
pub mod engines;
pub mod handler;
pub mod models;
pub mod results;
pub mod server;
pub mod templates;

use std::{net::TcpListener, sync::OnceLock, time::Duration};

use crate::server::router;

use actix_cors::Cors;
use actix_files as fs;
use actix_governor::{Governor, GovernorConfigBuilder};
use actix_web::{
    dev::Server,
    http::header,
    middleware::{Compress, Logger},
    web, App, HttpServer,
};
use cache::cacher::{Cacher, SharedCache};
use config::parser::Config;
use handler::{file_path, FileType};

/// A static constant for holding the cache struct.
static SHARED_CACHE: OnceLock<SharedCache> = OnceLock::new();

/// Runs the web server on the provided TCP listener and returns a `Server` instance.
///
/// # Arguments
///
/// * `listener` - A `TcpListener` instance representing the address and port to listen on.
///
/// # Returns
///
/// Returns a `Result` containing a `Server` instance on success, or an `std::io::Error` on failure.
///
/// # Example
///
/// ```rust
/// use std::{net::TcpListener, sync::OnceLock};
/// use websurfx::{config::parser::Config, run, cache::cacher::create_cache};
///
/// /// A static constant for holding the parsed config.
/// static CONFIG: OnceLock<Config> = OnceLock::new();
///
/// #[tokio::main]
/// async fn main(){
///     // Initialize the parsed config globally.
///     let config = CONFIG.get_or_init(|| Config::parse(true).unwrap());
///     let listener = TcpListener::bind("127.0.0.1:8080").expect("Failed to bind address");
///     let cache = create_cache(config).await;
///     let server = run(listener,&config,cache).expect("Failed to start server");
/// }
/// ```
pub fn run(
    listener: TcpListener,
    config: &'static Config,
    cache: impl Cacher + 'static,
) -> std::io::Result<Server> {
    let public_folder_path: &str = file_path(FileType::Theme)?;

    let cache = SHARED_CACHE.get_or_init(|| SharedCache::new(cache));

    let server = HttpServer::new(move || {
        let cors: Cors = Cors::default()
            .allow_any_origin()
            .allowed_methods(vec!["GET"])
            .allowed_headers(vec![
                header::ORIGIN,
                header::CONTENT_TYPE,
                header::REFERER,
                header::COOKIE,
            ]);

        App::new()
            // Compress the responses provided by the server for the client requests.
            .wrap(Compress::default())
            .wrap(Logger::default()) // added logging middleware for logging.
            .app_data(web::Data::new(config))
            .app_data(web::Data::new(cache))
            .wrap(cors)
            .wrap(Governor::new(
                &GovernorConfigBuilder::default()
                    .per_second(config.rate_limiter.time_limit as u64)
                    .burst_size(config.rate_limiter.number_of_requests as u32)
                    .finish()
                    .unwrap(),
            ))
            // Serve images and static files (css and js files).
            .service(
                fs::Files::new("/static", format!("{}/static", public_folder_path))
                    .show_files_listing(),
            )
            .service(
                fs::Files::new("/images", format!("{}/images", public_folder_path))
                    .show_files_listing(),
            )
            .service(router::robots_data) // robots.txt
            .service(router::index) // index page
            .service(server::routes::search::search) // search page
            .service(router::about) // about page
            .service(router::settings) // settings page
            .default_service(web::route().to(router::not_found)) // error page
    })
    .workers(config.threads as usize)
+    .keep_alive(Duration::from_secs(
        config.server_connection_keep_alive as u64,
    ))
    // Start server on 127.0.0.1 with the user provided port number. for example 127.0.0.1:8080.
    .listen(listener)?
    .run();
    Ok(server)
}
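
For context on the `.keep_alive(...)` call in the sketch above: in actix-web 4, `HttpServer::keep_alive` accepts anything convertible into `actix_web::http::KeepAlive`, so besides a fixed `Duration` the behaviour can also be delegated to the OS or disabled entirely. A minimal illustration, not part of the proposed change:

use std::time::Duration;

use actix_web::http::KeepAlive;

fn main() {
    // A fixed timeout, as used in the sample above (a Duration converts into KeepAlive).
    let _fixed: KeepAlive = Duration::from_secs(120).into();
    // Defer idle-connection handling to the operating system.
    let _os_managed = KeepAlive::Os;
    // Close each connection as soon as the response has been sent.
    let _disabled = KeepAlive::Disabled;
}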

config.lua

 -- ### General ###
logging = true -- an option to enable or disable logs.
debug = false -- an option to enable or disable debug mode.
threads = 10 -- the amount of threads that the app will use to run (the value should be greater than 0).

 -- ### Server ###
port = "8080" -- port on which server should be launched
binding_ip = "127.0.0.1" --ip address on the which server should be launched.
production_use = false -- whether to use production mode or not (in other words this option should be used if it is to be used to host it on the server to provide a service to a large number of users (more than one))
 -- if production_use is set to true
 -- There will be a random delay before sending the request to the search engines, this is to prevent DDoSing the upstream search engines from a large number of simultaneous requests.
request_timeout = 30 -- timeout for the search requests sent to the upstream search engines to be fetched (value in seconds).
tcp_connection_keepalive = 30 -- the amount of time the tcp connection should remain alive (or connected to the server). (value in seconds).
pool_idle_connection_timeout = 30 -- timeout for the idle connections in the reqwest HTTP connection pool (value in seconds).
rate_limiter = {
	number_of_requests = 20, -- The number of requests that are allowed within the provided time limit.
	time_limit = 3, -- The time limit (in seconds) within which the number of requests above should be accepted.
}
https_adaptive_window_size = false -- Set whether the server will use an adaptive/dynamic HTTPS window size, see https://httpwg.org/specs/rfc9113.html#fc-principles

+ server_connection_keep_alive = 120 -- the amount of time the server should keep the connection with the client open for subsequent requests before closing it (value in seconds).

 -- ### Search ###
 -- Filter results based on different levels. The levels provided are:
 -- {{
 -- 0 - None
 -- 1 - Low
 -- 2 - Moderate
 -- 3 - High
 -- 4 - Aggressive
 -- }}
safe_search = 2

 -- ### Website ###
 -- The different colorschemes provided are:
 -- {{
 -- catppuccin-mocha
 -- dark-chocolate
 -- dracula
 -- gruvbox-dark
 -- monokai
 -- nord
 -- oceanic-next
 -- one-dark
 -- solarized-dark
 -- solarized-light
 -- tokyo-night
 -- tomorrow-night
 -- }}
colorscheme = "catppuccin-mocha" -- the colorscheme name which should be used for the website theme
 -- The different themes provided are:
 -- {{
 -- simple
 -- }}
theme = "simple" -- the theme name which should be used for the website
 -- The different animations provided are:
 -- {{
 -- simple-frosted-glow
 -- }}
animation = "simple-frosted-glow" -- the animation name which should be used with the theme or `nil` if you don't want any animations.

 -- ### Caching ###
redis_url = "redis:https://127.0.0.1:8082" -- redis connection url address on which the client should connect on.
cache_expiry_time = 600 -- This option takes the expiry time of the search results (value in seconds and the value should be greater than or equal to 60 seconds).
 -- ### Search Engines ###
upstream_search_engines = {
    DuckDuckGo = true,
    Searx = false,
    Brave = false,
    Startpage = false,
    LibreX = false,
    Mojeek = false,
    Bing = false,
} -- select the upstream search engines from which the results should be fetched.

parser.rs

//! This module provides the functionality to parse the lua config and convert the config options
//! into rust readable form.

use crate::handler::{file_path, FileType};

use crate::models::parser_models::{AggregatorConfig, RateLimiter, Style};
use log::LevelFilter;
use mlua::Lua;
use std::{collections::HashMap, fs, thread::available_parallelism};

/// A named struct which stores the parsed config file options.
pub struct Config {
    /// It stores the parsed port number option on which the server should launch.
    pub port: u16,
    /// It stores the parsed ip address option on which the server should launch
    pub binding_ip: String,
    /// It stores the theming options for the website.
    pub style: Style,
    #[cfg(feature = "redis-cache")]
    /// It stores the redis connection url address on which the redis
    /// client should connect.
    pub redis_url: String,
    #[cfg(any(feature = "redis-cache", feature = "memory-cache"))]
    /// It stores the max TTL for search results in cache.
    pub cache_expiry_time: u16,
    /// It stores the option to whether enable or disable production use.
    pub aggregator: AggregatorConfig,
    /// It stores the option to whether enable or disable logs.
    pub logging: bool,
    /// It stores the option to whether enable or disable debug mode.
    pub debug: bool,
    /// It toggles whether to use adaptive HTTP windows
    pub adaptive_window: bool,
    /// It stores all the engine names that were enabled by the user.
    pub upstream_search_engines: HashMap<String, bool>,
    /// It stores the time (secs) which controls the server request timeout.
    pub request_timeout: u8,
    /// It stores the number of threads which controls the app will use to run.
    pub threads: u8,
    /// It stores configuration options for the ratelimiting middleware.
    pub rate_limiter: RateLimiter,
    /// It stores the level of safe search to be used for restricting content in the
    /// search results.
    pub safe_search: u8,
    /// It stores the TCP connection keepalive duration in seconds.
    pub tcp_connection_keepalive: u8,
    /// It stores the pool idle connection timeout in seconds.
    pub pool_idle_connection_timeout: u8,
+    /// It stores the server connection keep alive duration in seconds.
+    pub server_connection_keep_alive: u8,
}

impl Config {
    /// A function which parses the config.lua file and puts all the parsed options in the newly
    /// constructed Config struct and returns it.
    ///
    /// # Arguments
    ///
    /// * `logging_initialized` - It takes a boolean which ensures that the logging doesn't get
    /// initialized twice. Pass false if the logger has not yet been initialized.
    ///
    /// # Error
    ///
    /// Returns a lua parse error if parsing of the config.lua file fails or has a syntax error
    /// or io error if the config.lua file doesn't exist, otherwise it returns a newly constructed
    /// Config struct with all the parsed config options from the parsed config file.
    pub fn parse(logging_initialized: bool) -> Result<Self, Box<dyn std::error::Error>> {
        let lua = Lua::new();
        let globals = lua.globals();

        lua.load(&fs::read_to_string(file_path(FileType::Config)?)?)
            .exec()?;

        let parsed_threads: u8 = globals.get::<_, u8>("threads")?;

        let debug: bool = globals.get::<_, bool>("debug")?;
        let logging: bool = globals.get::<_, bool>("logging")?;
        let adaptive_window: bool = globals.get::<_, bool>("adaptive_window")?;

        if !logging_initialized {
            set_logging_level(debug, logging);
        }

        let threads: u8 = if parsed_threads == 0 {
            let total_num_of_threads: usize = available_parallelism()?.get() / 2;
            log::error!(
                "Config Error: The value of `threads` option should be a non zero positive integer"
            );
            log::error!("Falling back to using {} threads", total_num_of_threads);
            total_num_of_threads as u8
        } else {
            parsed_threads
        };

        let rate_limiter = globals.get::<_, HashMap<String, u8>>("rate_limiter")?;

        let parsed_safe_search: u8 = globals.get::<_, u8>("safe_search")?;
        let safe_search: u8 = match parsed_safe_search {
            0..=4 => parsed_safe_search,
            _ => {
                log::error!("Config Error: The value of `safe_search` option should be a non zero positive integer from 0 to 4.");
                log::error!("Falling back to using the value `1` for the option");
                1
            }
        };

        #[cfg(any(feature = "redis-cache", feature = "memory-cache"))]
        let parsed_cet = globals.get::<_, u16>("cache_expiry_time")?;
        #[cfg(any(feature = "redis-cache", feature = "memory-cache"))]
        let cache_expiry_time = match parsed_cet {
            0..=59 => {
                log::error!(
                    "Config Error: The value of `cache_expiry_time` must be greater than 60"
                );
                log::error!("Falling back to using the value `60` for the option");
                60
            }
            _ => parsed_cet,
        };

        Ok(Config {
            port: globals.get::<_, u16>("port")?,
            binding_ip: globals.get::<_, String>("binding_ip")?,
            style: Style::new(
                globals.get::<_, String>("theme")?,
                globals.get::<_, String>("colorscheme")?,
                globals.get::<_, Option<String>>("animation")?,
            ),
            #[cfg(feature = "redis-cache")]
            redis_url: globals.get::<_, String>("redis_url")?,
            aggregator: AggregatorConfig {
                random_delay: globals.get::<_, bool>("production_use")?,
            },
            logging,
            debug,
            adaptive_window,
            upstream_search_engines: globals
                .get::<_, HashMap<String, bool>>("upstream_search_engines")?,
            request_timeout: globals.get::<_, u8>("request_timeout")?,
            tcp_connection_keepalive: globals.get::<_, u8>("tcp_connection_keepalive")?,
            pool_idle_connection_timeout: globals.get::<_, u8>("pool_idle_connection_timeout")?,
+            server_connection_keep_alive: globals.get::<_, u8>("server_connection_keep_alive")?,
            threads,
            rate_limiter: RateLimiter {
                number_of_requests: rate_limiter["number_of_requests"],
                time_limit: rate_limiter["time_limit"],
            },
            safe_search,
            #[cfg(any(feature = "redis-cache", feature = "memory-cache"))]
            cache_expiry_time,
        })
    }
}

/// a helper function that sets the proper logging level
///
/// # Arguments
///
/// * `debug` - It takes the option to whether enable or disable debug mode.
/// * `logging` - It takes the option to whether enable or disable logs.
fn set_logging_level(debug: bool, logging: bool) {
    if let Ok(pkg_env_var) = std::env::var("PKG_ENV") {
        if pkg_env_var.to_lowercase() == "dev" {
            env_logger::Builder::new()
                .filter(None, LevelFilter::Trace)
                .init();
            return;
        }
    }

    // Initializing logging middleware with level set to default or info.
    let log_level = match (debug, logging) {
        (true, true) => LevelFilter::Debug,
        (true, false) => LevelFilter::Debug,
        (false, true) => LevelFilter::Info,
        (false, false) => LevelFilter::Error,
    };

    env_logger::Builder::new().filter(None, log_level).init();
}
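
Since the new field is a `u8` in the sketch above, values larger than 255 seconds would be rejected at parse time; the option could additionally be validated with a fallback, mirroring the pattern already used for `threads` and `safe_search`. A rough sketch only (the fallback value of 30 is an assumption, not something this issue specifies):

/// Hypothetical helper that validates the parsed `server_connection_keep_alive`
/// value, following the same fallback style as the `threads` handling above.
fn validate_keep_alive(parsed_keep_alive: u8) -> u8 {
    if parsed_keep_alive == 0 {
        log::error!(
            "Config Error: The value of `server_connection_keep_alive` should be a non zero positive integer"
        );
        log::error!("Falling back to using the value `30` for the option");
        30
    } else {
        parsed_keep_alive
    }
}

In `Config::parse`, the struct field would then be filled with `validate_keep_alive(globals.get::<_, u8>("server_connection_keep_alive")?)` instead of the raw lookup.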

To reduce notifications, issues are locked until they are 🏁 status: ready for dev and to be assigned. You can learn more in our contributing guide https://github.com/neon-mmd/websurfx/blob/rolling/CONTRIBUTING.md

@github-actions github-actions bot locked and limited conversation to collaborators Apr 26, 2024
@neon-mmd neon-mmd added this to the Complete v2.0.0 release milestone Apr 26, 2024
@github-actions github-actions bot unlocked this conversation May 3, 2024

github-actions bot commented May 3, 2024

The issue has been unlocked and is now ready for dev. If you would like to work on this issue, you can comment to have it assigned to you. You can learn more in our contributing guide https://github.com/neon-mmd/websurfx/blob/rolling/CONTRIBUTING.md

@ddotthomas
Contributor

I was thinking of changing the key name to client_connection_keep_alive to help differentiate between the keep-alive timer for the server's client connections to the upstream search engines and the one for the client's connection to the HTTP server, and also modifying the descriptions of the keys in the config.lua file.
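
A minimal sketch of what that suggested rename could look like (the names here are only illustrative, taken from the comment above; the Lua side is shown as a comment and the struct is abbreviated to the relevant field):

// websurfx/config.lua (renamed key):
//   client_connection_keep_alive = 120 -- the amount of time the server should keep the
//                                      -- connection with the client alive (value in seconds).

/// Abbreviated `Config` struct from src/config/parser.rs with the renamed field.
pub struct Config {
    /// It stores the client connection keep alive duration in seconds.
    pub client_connection_keep_alive: u8,
    // ...the remaining fields stay unchanged...
}

// In `Config::parse`, the lookup would change accordingly:
//   client_connection_keep_alive: globals.get::<_, u8>("client_connection_keep_alive")?,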
