Technology
This page provides an overview of the technology and infrastructure used to run our sites.
Platform and Software
Wikis like Bahaipedia use the MediaWiki platform, the same software that powers Wikipedia. Using MediaWiki ensures our websites receive continuous updates and support, and users are likely to already be familiar with its user interface.
Infrastructure and Hosting
Our servers are distributed across four regions: Virginia (United States), Singapore, São Paulo, and Frankfurt. A distributed setup was originally chosen to ensure the fastest possible page rendering and file download times for users worldwide. Today it is also an effective strategy for isolating and mitigating sophisticated attempts to attack or crawl substantial volumes of content.
File Hosting and Distribution
Files uploaded to bahai.media are replicated to storage in all four regions: Virginia, São Paulo, Singapore, and Frankfurt. Each region acts as an origin server for CloudFront, a Content Delivery Network, which caches and distributes files more broadly to reduce download times. If CloudFront cannot locate a requested file in its cache, it makes a request to the closest available region. This setup significantly reduces download times worldwide, even when a file is not in the cache.
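The "closest available region" choice on a cache miss can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the region keys, the approximate coordinates, and the function names are hypothetical, and the page does not describe how the production routing actually measures closeness.

```python
import math

# Approximate (latitude, longitude) for the four regions named above.
# Keys and coordinates are illustrative assumptions, not site configuration.
ORIGINS = {
    "virginia": (38.9, -77.4),
    "sao_paulo": (-23.5, -46.6),
    "frankfurt": (50.1, 8.7),
    "singapore": (1.35, 103.8),
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def closest_origin(viewer, available=None):
    """Return the name of the nearest origin, optionally limited to available regions."""
    candidates = ORIGINS.keys() if available is None else available
    return min(candidates, key=lambda name: haversine_km(viewer, ORIGINS[name]))
```

For example, `closest_origin((48.85, 2.35))` picks the Frankfurt origin for a viewer near Paris, and passing `available=["virginia", "sao_paulo", "singapore"]` shows the fallback when one region is out of service.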
Uploads are processed in the region closest to the user and then automatically replicated to all other regions. Some files are rendered on demand, such as a single page from a PDF stored on our site, and this configuration delivers the requested content to users as quickly as possible.
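The upload-then-replicate flow can be pictured with a minimal in-memory Python sketch. The region names, the dictionary-based "stores", and the `upload` function are hypothetical stand-ins for illustration; the real system replicates between storage buckets, and its mechanics are not described here.

```python
# Each region's storage is modelled as a plain dict of filename -> bytes.
# Region names are illustrative assumptions.
REGIONS = ("virginia", "sao_paulo", "frankfurt", "singapore")


def make_stores(regions=REGIONS):
    """Create one empty in-memory store per region."""
    return {name: {} for name in regions}


def upload(stores, home_region, filename, data):
    """Store a file in the uploader's closest region, then copy it everywhere else.

    Returns the list of regions the file was replicated to.
    """
    if home_region not in stores:
        raise ValueError(f"unknown region: {home_region}")
    stores[home_region][filename] = data  # processed in the closest region first
    replicated = []
    for name, store in stores.items():
        if name != home_region:
            store[filename] = data  # fan out to every other region
            replicated.append(name)
    return replicated
```

After a call such as `upload(stores, "singapore", "example.pdf", b"...")`, every region holds a copy, so any origin can answer a later request for that file.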
Traffic Handling and Caching
We use a multi-layered caching strategy, primarily with Varnish and Redis, to manage traffic and reduce server load. Varnish is also our first line of defense against automated content scraping, and its rules can be adjusted dynamically in response to attacks.
How We Handle Traffic
Our technology stack routes and manages traffic using the following components:
- Nginx: Acts as the primary web server and reverse proxy, handling HTTPS traffic and forwarding requests to Varnish.
- Varnish: Caches frequently accessed pages served to anonymous users, reducing load on the backend. Varnish also analyses incoming requests and rejects those matching certain patterns.
- Redis: A second caching layer, primarily for users who are logged in, storing details such as user sessions.
- Elasticsearch: Provides fast, full-text search across our content.
- CloudFront: Our CDN, which distributes and caches static assets (like images, videos, and documents) globally.
- AWS Lambda: Dynamically routes CDN cache misses to the closest geographic origin bucket. Lambda also helps distribute files after they have been uploaded.
- Ansible and GitHub: Configuration files are centrally managed in GitHub, while Ansible automates server-specific changes before deploying updates across all regions.
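The page-cache behaviour described for Varnish (reject requests matching certain patterns, serve cached pages to anonymous users, pass logged-in users through to the backend) can be sketched in Python. This is a simplified model, not the site's configuration: Varnish itself is configured in VCL, and the `examplebot` pattern, the `session` cookie name, and the class names below are all hypothetical.

```python
import re
from dataclasses import dataclass, field

# Hypothetical rejection rule; the real patterns are not described on this page.
BLOCKED_USER_AGENTS = [re.compile(r"examplebot", re.IGNORECASE)]


@dataclass
class Request:
    path: str
    user_agent: str = ""
    cookies: dict = field(default_factory=dict)


class PageCache:
    """Toy model of a page cache sitting in front of a backend renderer."""

    def __init__(self, render):
        self._render = render  # callable standing in for the MediaWiki backend
        self._pages = {}
        self.hits = 0
        self.misses = 0

    def handle(self, request):
        # 1. Reject requests whose user agent matches a blocked pattern.
        if any(p.search(request.user_agent) for p in BLOCKED_USER_AGENTS):
            return 403, "Forbidden"
        # 2. Logged-in users (assumed to carry a "session" cookie) bypass the cache.
        if "session" in request.cookies:
            return 200, self._render(request)
        # 3. Anonymous users are served from the cache, rendering only on a miss.
        if request.path in self._pages:
            self.hits += 1
        else:
            self.misses += 1
            self._pages[request.path] = self._render(request)
        return 200, self._pages[request.path]
```

In this model, two anonymous requests for the same path reach the backend only once, which is the load reduction the caching layer is meant to provide.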
Custom Extensions and Code
We develop custom MediaWiki extensions to add specific features to our wikis. For more technical details or to explore the custom code we use, visit our GitHub repository.