Technology & Algorithms

A deep dive into the architecture, the RAG pipeline, and the mathematical-technical models behind Balou Tools.

🏗️

System Architecture & High-Level Data Flow

Balou Tools is designed as a high-performance, modern Developer Diagnostic Platform. The system architecture strictly separates the statically optimized, reactive frontend from a stateless Spring Boot backend. The backend utilizes virtual threads to parallelize external network requests.

Internet / Client Browser · HTTPS

Nginx Reverse Proxy: SSL Termination · Routing

HTTPS · REST · Server-Sent Events (SSE)

Frontend: Astro 6.4 · Svelte 5.55 Islands · Node.js Standalone (SSR)

REST / SSE API

Backend: Spring Boot 4.0.6

Runtime: Java 25 · Virtual Threads

Backend modules: DNS Check · SSL Check · HTTP Headers · PageSpeed API · AI / RAG Core · Multi-Tenant Workspace · Caffeine Cache · Rate Limiter · SSRF / DNS-Rebind Guard

JDBC · Cache Protocol · HTTPS

PostgreSQL + pgvector: HNSW · 768 Dim · Port 5433

Redis: Distributed Cache · Port 6379

Google Gemini API: gemini-2.5-flash-lite · gemini-embedding-001

Architecture Principles:

Virtual Threads (Java 25): Each blocking I/O operation (DNS queries, SSL handshakes, HTTP calls) runs on its own lightweight virtual thread. This removes the classic bottleneck of a fixed-size thread pool.
Privacy-First & Local Execution: Tools for text manipulation, hashing, Base64 encoding, JWT decoding, and generators run 100% client-side in the user's browser. This sensitive data is never sent to the server.
Cache-First: A multi-stage cache infrastructure (in-memory Caffeine cache and distributed Redis cache) reduces latency and API costs.
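
The virtual-thread principle can be illustrated with a minimal sketch. It assumes each diagnostic check is available as a `Callable<String>`; the class name `ParallelChecks` and the method `runAll` are hypothetical and not taken from the Balou Tools code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: runs blocking checks in parallel, one virtual thread per check.
final class ParallelChecks {

    /** Returns the results in the same order as the submitted checks. */
    static List<String> runAll(List<Callable<String>> checks) {
        // Blocking I/O parks the virtual thread instead of an OS thread,
        // so no fixed-size pool has to be sized or tuned.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<String> results = new ArrayList<>();
            for (Future<String> future : executor.invokeAll(checks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for checks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a check failed", e.getCause());
        }
    }
}
```

`Executors.newVirtualThreadPerTaskExecutor()` requires Java 21 or newer; `invokeAll` waits for all checks, so the slowest external call determines the total latency.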

💻

Technology Stack

The components of Balou Tools are based on established and future-proof core technologies with a focus on SEO performance, robustness, and extremely fast load times.

Area	Technology	Version	Purpose / Function
Frontend	Astro Framework	6.4.x	Meta-framework for Static Site Generation (SSG) & SSR, SEO optimization
Client-UI	Svelte	5.55.x	Interactive, reactive UI islands using the new Runes API
Backend	Spring Boot	4.0.6	REST APIs, Dependency Injection, system orchestration
Language	Java / TypeScript	25 / 6.0	Virtual threads, records, type safety in frontend and backend
AI Integration	Spring AI	2.0.0-RC1	Standardized abstraction for vector databases and LLM calls
Language Model	Google Gemini	API	LLM: `gemini-2.5-flash-lite` · Embeddings: `gemini-embedding-001`
Database	PostgreSQL + pgvector	16+	Relational data store & vector database (HNSW indices, 768 dimensions)
Caching	Caffeine & Redis	7.x	Local, extremely fast in-memory cache & distributed cache for scaling

🤖

AI & RAG Pipeline

The AI-supported error diagnosis is based on a Retrieval-Augmented Generation (RAG) pipeline. It enriches the raw data from the diagnostic tools with specific context from the internal Balou knowledge base.

Input Protection (SensitiveInputGuard): Every user request or raw diagnostic data set is scanned for sensitive data (such as passwords, API keys, cookie headers, JWTs) using regular expressions and classification methods. These are redacted before being passed to the pipeline.
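
A regex-based redaction step of this kind might look like the following sketch. The three patterns and the placeholder strings are illustrative assumptions; the actual rules and the classification methods of the SensitiveInputGuard are not documented here.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of regex-based redaction; patterns are assumptions, not the real rule set.
final class RedactionSketch {

    // JWT-like token: three base64url segments, the first starting with "eyJ".
    private static final Pattern JWT =
            Pattern.compile("\\beyJ[A-Za-z0-9_-]+\\.[A-Za-z0-9_-]+\\.[A-Za-z0-9_-]+");

    // Whole Cookie / Set-Cookie header lines.
    private static final Pattern COOKIE =
            Pattern.compile("(?im)^((?:set-)?cookie:[ \\t]*).*$");

    // key=value or key: value pairs with a sensitive-looking key.
    private static final Pattern KEY_VALUE =
            Pattern.compile("(?i)\\b(password|passwd|api[_-]?key|secret|token)(\\s*[=:]\\s*)\\S+");

    static String redact(String input) {
        String out = JWT.matcher(input).replaceAll("[REDACTED_JWT]");
        out = COOKIE.matcher(out).replaceAll("$1[REDACTED]");
        out = KEY_VALUE.matcher(out).replaceAll("$1$2[REDACTED]");
        return out;
    }
}
```

Pure pattern matching misses secrets that do not follow a known shape, which is presumably why the pipeline combines it with classification.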

Semantic Cache (SemanticCacheService): To save costs and latency, the query is vectorized and compared with previously answered queries. A cosine similarity of ≥ 0.95 (95%) counts as a semantically equivalent question. In case of a match, the cached response is served directly without calling the LLM.
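
The cache-hit decision reduces to a cosine similarity comparison. The sketch below assumes embeddings are plain `float[]` arrays; the class and method names are hypothetical.

```java
// Hypothetical sketch of the semantic cache hit test.
final class SemanticCacheSketch {

    static final double HIT_THRESHOLD = 0.95; // threshold stated in the text

    /** Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0.0 for a zero vector. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    static boolean isCacheHit(float[] query, float[] cached) {
        return cosine(query, cached) >= HIT_THRESHOLD;
    }
}
```

In production the comparison would more plausibly run inside the vector store than in application code; the loop only makes the arithmetic explicit.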

Query Rewriter: If the query is imprecise, the QueryRewriterService reformulates it. This optimizes the subsequent retrieval in the vector database.

Vector Search (pgvector HNSW): The optimized query searches the vector_store table for topically relevant document chunks. Up to 4 chunks (Top-K = 4) with a similarity of at least 0.6 (Similarity Threshold = 0.6) are retrieved.
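
The effect of the two retrieval parameters can be shown in memory. This is only a sketch: in the real pipeline the filtering happens in the database query, and the `Hit` record is a hypothetical stand-in for a search result.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory illustration of Top-K selection with a similarity threshold.
final class TopKSketch {

    record Hit(String chunkId, double similarity) {}

    static List<Hit> select(List<Hit> candidates, double threshold, int topK) {
        return candidates.stream()
                .filter(hit -> hit.similarity() >= threshold)      // drop weak matches
                .sorted(Comparator.comparingDouble(Hit::similarity).reversed())
                .limit(topK)                                        // keep the best K
                .toList();
    }
}
```

With a threshold of 0.6 and Top-K = 4, a query may return fewer than four chunks, or none at all, when the knowledge base has no sufficiently similar content.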

Prompt Synthesis (RagPromptBuilder): System prompts, tool diagnostic data, RAG contexts (knowledge base), and affiliate data (for contextual product recommendations) are merged into a structured prompt.

Generation & SSE Streaming: The LLM gemini-2.5-flash-lite generates the response, which is streamed character by character via Server-Sent Events (SSE) to the frontend. The response is output in a structured format (summary, severity, causes, corrective steps, sources, confidence score).

Knowledge Base & Ingestion: New markdown documents in the doc/ path are automatically read (KnowledgeIngestionService). A ChunkingService splits texts into overlapping sections, enriches them with metadata (category, tool key, version), generates a 768-dimensional vector via gemini-embedding-001, and stores it in the PostgreSQL HNSW vector index.
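
Splitting into overlapping sections can be sketched as a sliding window. The example below works on characters and takes size and overlap as parameters; the units and values actually used by the ChunkingService are not stated, so both are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sliding-window chunker; character-based sizes are an assumption.
final class ChunkingSketch {

    static List<String> chunk(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("require size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        int step = size - overlap; // each window starts 'step' characters after the previous one
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(text.length(), start + size);
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }
}
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of storing some text twice.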

🧮

Vector Search & Reranking Algorithm

Raw database results ranked by pgvector cosine similarity alone often do not surface the most relevant chunks. The VectorSearchService therefore applies a composite reranking score with five factors.

Reranking Score Formula

Score = Sim_Cosine + Bonus_Category + Bonus_Keywords + Bonus_Language + Bonus_Workspace

The factors in detail:

Base Cosine Similarity (Sim_Cosine)

The mathematical similarity between the query vector and the document vector (value range 0.0 to 1.0). A value of ≥ 0.6 is required for consideration.

+ 0.20 Category Bonus (Bonus_Category)

If the category of the knowledge base chunk (e.g., 'SSL') matches the active diagnostic tool exactly, the chunk receives a relevance bonus of 0.2.

+ 0.02 - 0.05 Keyword Match (Bonus_Keywords)

In addition to vector similarity, a lexical search is performed. A small bonus is added for each occurrence of an important query keyword in the chunk.

+ 0.10 Language Match (Bonus_Language)

If the language of the chunk (e.g., 'de' for German) matches the user interface of the requesting user, the relevance is increased by 0.1.

+ 0.10 Workspace Match (Bonus_Workspace)

In multi-tenant SaaS operation, documents associated with the user's specific workspace (Tenant Knowledge) receive a prioritization bonus of 0.1.
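
The five factors combine additively. In the sketch below the `Chunk` record and its field names are hypothetical, and the keyword rule (0.02 per hit, capped at 0.05) is an assumption, since the text only gives the range.

```java
// Hypothetical sketch of the composite reranking score.
final class RerankSketch {

    /** A retrieved chunk; field names are illustrative, not the project's schema. */
    record Chunk(double cosine, String category, String language,
                 boolean sameWorkspace, int keywordHits) {}

    static final double CATEGORY_BONUS = 0.20;
    static final double LANGUAGE_BONUS = 0.10;
    static final double WORKSPACE_BONUS = 0.10;
    // Assumption: 0.02 per keyword hit, capped at 0.05.
    static final double KEYWORD_BONUS_PER_HIT = 0.02;
    static final double KEYWORD_BONUS_CAP = 0.05;

    static double score(Chunk chunk, String activeCategory, String uiLanguage) {
        double score = chunk.cosine();
        if (chunk.category().equals(activeCategory)) {
            score += CATEGORY_BONUS;
        }
        score += Math.min(KEYWORD_BONUS_CAP, chunk.keywordHits() * KEYWORD_BONUS_PER_HIT);
        if (chunk.language().equals(uiLanguage)) {
            score += LANGUAGE_BONUS;
        }
        if (chunk.sameWorkspace()) {
            score += WORKSPACE_BONUS;
        }
        return score;
    }
}
```

For example, a German SSL chunk from the user's workspace with a cosine similarity of 0.70 and one keyword hit scores 0.70 + 0.20 + 0.02 + 0.10 + 0.10 = 1.12, so the final score can exceed 1.0.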

📊

Scoring & Evaluation Models

The diagnostic tools evaluate target systems with a score (0 to 100) and derive a grade (Grade A to F). The logic is encapsulated in the central ScoringEngine and is based on the following models:

1. SSL/TLS Audit (Subtractive Scoring, starting value: 100)

Starting from 100 points, security flaws lead to point deductions. The final result is clamped to the range [0, 100]:

Finding / Vulnerability	Deduction (Points)	Category / Severity
Certificate invalid (expired, hostname mismatch)	-100 (Score = 0)	CRITICAL
Certificate chain not trusted (Trust Store)	-40	CRITICAL
SSLv3 active / Weak cipher suite (RC4, 3DES, MD5)	-30	CRITICAL
Weak RSA/DSA key (< 2048 bits) or signature (SHA-1)	-30	CRITICAL
TLS 1.0 / TLS 1.1 active or incomplete chain	-20	WARN
Remaining lifetime ≤ 14 days (≤ 30 days)	-20 (-10)	CRITICAL / WARN
Weak EC curve (< 256 bits)	-15	WARN
OCSP stapling inactive / No CT proof (SCT)	-5	INFO
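
The subtractive model can be sketched as follows. The enum constants are hypothetical names, and the sketch assumes that each table row counts once, which the table does not state explicitly for rows that combine two findings.

```java
import java.util.Set;

// Hypothetical sketch of subtractive SSL/TLS scoring with clamping.
final class SslScoreSketch {

    /** One constant per table row (assumption: a combined row deducts only once). */
    enum Finding {
        CERT_INVALID(100),
        CHAIN_UNTRUSTED(40),
        SSLV3_OR_WEAK_CIPHER(30),
        WEAK_KEY_OR_SIGNATURE(30),
        LEGACY_TLS_OR_INCOMPLETE_CHAIN(20),
        EXPIRES_WITHIN_14_DAYS(20),
        EXPIRES_WITHIN_30_DAYS(10),
        WEAK_EC_CURVE(15),
        NO_OCSP_STAPLING_OR_SCT(5);

        final int deduction;

        Finding(int deduction) {
            this.deduction = deduction;
        }
    }

    static int score(Set<Finding> findings) {
        int score = 100;
        for (Finding finding : findings) {
            score -= finding.deduction;
        }
        return Math.max(0, Math.min(100, score)); // clamp to [0, 100]
    }
}
```

Clamping matters as soon as deductions add up to more than 100, for example an invalid certificate combined with an untrusted chain.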

Checked Security Features

The following features are collected and evaluated per certificate or TLS endpoint:

Validity: Is the certificate within its validity period (not expired, already active) and technically correct?
Hostname Match: Does the certificate (CN / SAN) cover the requested hostname?
Remaining Lifetime: Remaining days until expiration, including early renewal warnings (≤ 30 / ≤ 14 days).
Certificate Chain: Completeness of the chain (existing intermediate certificates).
Trustworthiness: Validation of the chain against the recognized system trust store (detection of self-signed / incorrectly chained certificates).
OCSP Stapling: Does the server deliver the revocation status (OCSP) itself?
Protocol Versions: Detection of obsolete/insecure versions (SSLv3, TLS 1.0, TLS 1.1) compared to modern ones (TLS 1.2 / 1.3).
Cipher Suites: Detection of weak encryption suites (RC4, 3DES, DES, NULL, EXPORT).
Key Strength: Evaluation of the key length (RSA/DSA ≥ 2048 bits, EC curves ≥ 256 bits).
Signature Algorithm: Detection of broken algorithms (MD5, SHA-1) compared to SHA-256+.
Certificate Transparency: Presence of embedded SCTs (Signed Certificate Timestamps).

2. Security Headers (Subtractive Scoring, starting value: 100)

Starting from 100 points, missing or weak HTTP security headers lead to point deductions. The evaluation is endpoint- and context-aware: schemaless inputs are checked over HTTPS, and pure hardening headers (defense-in-depth) are listed as INFO without deduction, so that an otherwise clean site is not downgraded.

Finding / Security Header	Deduction (Points)	Category / Severity
`Strict-Transport-Security` (HSTS) missing (over HTTPS)	-25	CRITICAL
`Content-Security-Policy` (CSP) missing completely	-25	CRITICAL
CSP only in Report-Only mode (present, not enforced)	-5	WARN
`X-Frame-Options` missing (clickjacking protection)	-12	WARN
`X-Content-Type-Options` missing (defense-in-depth)	0	INFO
`Referrer-Policy` missing (defense-in-depth)	0	INFO
`Permissions-Policy` missing (defense-in-depth)	0	INFO
HSTS checked over pure HTTP endpoint (RFC 6797)	0	INFO

Endpoint & context-aware evaluation: Since HSTS according to RFC 6797 only applies to a TLS-secured response, a missing HSTS on a pure HTTP endpoint is not evaluated as a critical vulnerability. An existing Content-Security-Policy-Report-Only is considered present (monitored but not enforced) and not as 'missing completely'. A functioning HTTP→HTTPS redirect is recorded positively as a passed check.
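
The context-aware rules can be expressed compactly. The sketch assumes response headers are available as a map with lower-case header names; the class and method names are hypothetical, and INFO findings are omitted because they do not change the score.

```java
import java.util.Map;

// Hypothetical sketch of context-aware security header scoring.
final class HeaderScoreSketch {

    /** @param headers response headers keyed by lower-case name (assumption) */
    static int score(Map<String, String> headers, boolean https) {
        int score = 100;

        // HSTS only counts on a TLS-secured response (RFC 6797).
        if (https && !headers.containsKey("strict-transport-security")) {
            score -= 25;
        }

        // An enforced CSP costs nothing, Report-Only costs 5, no CSP at all costs 25.
        boolean enforced = headers.containsKey("content-security-policy");
        boolean reportOnly = headers.containsKey("content-security-policy-report-only");
        if (!enforced) {
            score -= reportOnly ? 5 : 25;
        }

        if (!headers.containsKey("x-frame-options")) {
            score -= 12;
        }
        return Math.max(0, score);
    }
}
```

With these deductions, a site that sends no security headers at all scores 38 over HTTPS and 63 over plain HTTP, because the HSTS deduction does not apply to the latter.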

3. PageSpeed & Domain-Health Aggregation

PageSpeed Score: Corresponds to the performance score (0–100) provided by the Google PageSpeed Insights API. For parallel measurement of mobile and desktop views, the arithmetic mean is calculated.
Domain-Health Score (Aggregate): Forms the unweighted arithmetic mean of all successful sub-scores (DNS, SSL, Security Headers, PageSpeed). If a sub-check fails due to timeouts, the system status changes to PARTIAL, but the remaining average is still output.
Score-to-Grade Assignment: A (90-100) · B (80-89) · C (70-79) · D (60-69) · E (50-59) · F (0-49)
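
Aggregation and grade assignment can be sketched together. The sketch models a failed sub-check as an empty `OptionalInt`; rounding to the nearest integer, the status name `OK`, and the behavior when every sub-check fails are assumptions.

```java
import java.util.List;
import java.util.OptionalInt;

// Hypothetical sketch of the Domain-Health aggregate and the score-to-grade mapping.
final class HealthScoreSketch {

    record Result(int score, char grade, String status) {}

    static char grade(int score) {
        if (score >= 90) return 'A';
        if (score >= 80) return 'B';
        if (score >= 70) return 'C';
        if (score >= 60) return 'D';
        if (score >= 50) return 'E';
        return 'F';
    }

    /** Unweighted mean of all successful sub-scores; an empty entry is a failed sub-check. */
    static Result aggregate(List<OptionalInt> subScores) {
        int sum = 0;
        int successful = 0;
        for (OptionalInt subScore : subScores) {
            if (subScore.isPresent()) {
                sum += subScore.getAsInt();
                successful++;
            }
        }
        if (successful == 0) {
            throw new IllegalArgumentException("no successful sub-check to aggregate");
        }
        int score = (int) Math.round((double) sum / successful);
        String status = successful == subScores.size() ? "OK" : "PARTIAL";
        return new Result(score, grade(score), status);
    }
}
```

If DNS, SSL, and PageSpeed return 90, 80, and 70 while the header check times out, the aggregate is 80 (grade B) with status PARTIAL.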

4. DNS Audit (Special Rules, Deductions & JNDI Limitation)

DMARC Compensation (SPF): If a domain has an SPF soft-fail (~all) but is simultaneously protected by a restrictive DMARC policy (p=reject or p=quarantine), the standard deduction of 10 points is suspended. The finding is classified as INFO instead of WARN because the spoofing protection is guaranteed by DMARC.
Adjusted Deductions for CAA & DNSSEC: Since the absence of CAA records or an inactive DNSSEC does not cause direct system failures but represents recommendations, the deduction for both findings was reduced from 10 to **5 points** each. This better reflects the actual security risk.
JNDI CAA Limitation: The DNS check uses the integrated Java JNDI directory for resolution. In certain Java runtimes, JNDI has limitations when querying CAA records (type 257), which can lead to false alarms ('No CAA record found'). To mitigate unjustified point deductions due to JNDI errors, the CAA deduction is limited to a moderate 5 points.
Propagation Consistency (0 points deduction): Differences in IP responses from public DNS resolvers (Google, Cloudflare, Quad9) often indicate GeoDNS load balancing or CDN Anycast services (such as with google.com). This finding deducts **0 points** (previously -10) and is classified as pure INFO, as it is a legitimate network property and not a misconfiguration.
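
The DMARC compensation rule from the first item can be sketched with simple string parsing. The method names are hypothetical, and the parsing is deliberately minimal: it only checks whether the SPF record ends in `~all` and reads the `p=` tag of the DMARC record.

```java
import java.util.Locale;

// Hypothetical sketch of the SPF soft-fail deduction with DMARC compensation.
final class DnsDeductionSketch {

    static boolean isSoftFail(String spfRecord) {
        return spfRecord != null
                && spfRecord.trim().toLowerCase(Locale.ROOT).endsWith("~all");
    }

    static boolean hasEnforcingDmarc(String dmarcRecord) {
        if (dmarcRecord == null) {
            return false;
        }
        for (String tag : dmarcRecord.split(";")) {
            String[] keyValue = tag.trim().split("=", 2);
            // Only the "p" tag counts here; "sp" (subdomain policy) is ignored.
            if (keyValue.length == 2 && keyValue[0].trim().equalsIgnoreCase("p")) {
                String policy = keyValue[1].trim().toLowerCase(Locale.ROOT);
                return policy.equals("reject") || policy.equals("quarantine");
            }
        }
        return false;
    }

    /** Points deducted for the SPF soft-fail finding: 10, or 0 when DMARC compensates. */
    static int spfSoftFailDeduction(String spfRecord, String dmarcRecord) {
        if (!isSoftFail(spfRecord)) {
            return 0; // this finding does not apply
        }
        return hasEnforcingDmarc(dmarcRecord) ? 0 : 10;
    }
}
```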

🛡️

Security & Resilience Architecture

As a diagnostic tool, Balou Tools executes requests to arbitrary target systems on behalf of the user. This requires thorough security measures to protect its own backend and safeguard user privacy.

SSRF Protection (Server-Side Request Forgery)

Every URL entered by the user and every single redirect hop is validated by the UrlSafetyValidator, checking all resolved A/AAAA records (IPv4 and IPv6). The following are blocked:

Private IP ranges (RFC 1918, e.g., 10.0.0.0/8, 192.168.0.0/16)
Loopback addresses (IPv4 127.0.0.1, IPv6 ::1)
Link-local addresses (RFC 3927, e.g., 169.254.0.0/16 incl. cloud metadata 169.254.169.254)
Carrier-grade NAT (100.64.0.0/10), reserved blocks (0.0.0.0/8, 240.0.0.0/4), as well as multicast and unspecified addresses (0.0.0.0, ::)
IPv6 unique local addresses (fc00::/7)
Protocols other than HTTP/HTTPS as well as target ports other than 80 and 443

DNS-rebinding & TOCTOU protection: Since the JDK HttpClient resolves DNS names itself, a central SafeHttpClientFactory re-resolves the hostname immediately before every outbound request and re-checks all returned addresses. Automatic redirects are disabled (Redirect.NEVER); each hop is validated individually. This closes the window in which an attacker with controlled DNS could flip the record to an internal address after the initial check.
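
The address checks listed above can be approximated with the standard `java.net.InetAddress` predicates plus a few manual CIDR tests. This is a sketch, not the UrlSafetyValidator itself; the class name is hypothetical, and scheme and port checks are left out.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical sketch of the per-address SSRF check.
final class SsrfGuardSketch {

    static boolean isBlocked(InetAddress address) {
        // Built-in predicates: loopback, unspecified (0.0.0.0, ::), link-local,
        // RFC 1918 private ranges (site-local), multicast.
        if (address.isLoopbackAddress() || address.isAnyLocalAddress()
                || address.isLinkLocalAddress() || address.isSiteLocalAddress()
                || address.isMulticastAddress()) {
            return true;
        }
        byte[] bytes = address.getAddress();
        if (bytes.length == 4) {
            int first = bytes[0] & 0xFF;
            int second = bytes[1] & 0xFF;
            if (first == 0) return true;                             // 0.0.0.0/8
            if (first == 100 && (second & 0xC0) == 0x40) return true; // 100.64.0.0/10 (CGNAT)
            return first >= 240;                                      // 240.0.0.0/4
        }
        return (bytes[0] & 0xFE) == 0xFC;                             // fc00::/7 (unique local)
    }

    /** Convenience for IP literals; a hostname passed here would trigger a DNS lookup. */
    static boolean isBlockedLiteral(String ipLiteral) {
        try {
            return isBlocked(InetAddress.getByName(ipLiteral));
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid IP literal: " + ipLiteral, e);
        }
    }
}
```

To guard against rebinding, such a check has to run on every address returned by the resolution that immediately precedes the outbound request, not only on the first lookup.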

IP Anonymization & Rate Limiting

To protect against abuse and denial-of-service attacks, IP-based rate limiting is active at the endpoints. In order to comply with data privacy regulations (GDPR/DSG), IP addresses are not stored in plain text. The IP address is hashed using **HMAC-SHA256** with a server-side pepper (RATELIMIT_IP_HASH_SECRET). The counter expires on a rolling basis after 24 hours.
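
Keyed hashing of the IP address can be done with the JDK's `javax.crypto.Mac`. In this sketch the pepper is passed in as a parameter; how the value of RATELIMIT_IP_HASH_SECRET is loaded is not shown, and the class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.HexFormat;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of IP pseudonymization for rate-limit keys.
final class IpHashSketch {

    /** Returns the HMAC-SHA256 of the IP address as a 64-character lower-case hex string. */
    static String hashIp(String ip, String pepper) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(pepper.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] digest = mac.doFinal(ip.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 is unavailable or the key is invalid", e);
        }
    }
}
```

A keyed hash is used rather than plain SHA-256 because the IPv4 address space is small enough to enumerate; without the secret, every hash could be reversed by brute force.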

Resilience & Timeout Policy

To prevent hanging backend threads due to slow external servers, Balou Tools enforces strict timeouts: a maximum of 3 seconds for connection establishment (Connect) and 5 seconds for reading data (Read). For transient errors, a maximum of one (1) retry with a short backoff (200-500 ms) is executed. Detailed error messages are hidden, and a standardized correlation ID is provided to the client.
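
With the JDK `HttpClient`, the timeout policy could be configured roughly as follows. Note that `HttpRequest.timeout` bounds the wait for the response rather than each individual read, so it only approximates a read timeout. The class, the `IoCall` interface, and the retry helper are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of the timeout and retry policy.
final class TimeoutPolicySketch {

    /** A call that may fail with a transient I/O error. */
    interface IoCall<T> {
        T run() throws IOException;
    }

    static HttpClient client() {
        return HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(3))          // connection establishment
                .followRedirects(HttpClient.Redirect.NEVER)     // each hop is validated separately
                .build();
    }

    static HttpRequest get(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(5))                 // wait for the response
                .GET()
                .build();
    }

    /** Runs the call and retries at most once after a 200-500 ms backoff. */
    static <T> T withOneRetry(IoCall<T> call) {
        try {
            return call.run();
        } catch (IOException firstFailure) {
            try {
                Thread.sleep(ThreadLocalRandom.current().nextLong(200, 501));
                return call.run();
            } catch (IOException secondFailure) {
                throw new UncheckedIOException(secondFailure);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted during backoff", e);
            }
        }
    }
}
```

The sketch treats every `IOException` as transient; a real policy would likely distinguish retryable failures such as connection resets from permanent ones.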