Architecture

Technical overview of s3finder's internal design and components.

System Overview

┌─────────────────────────────────────────────────────────────────┐
│                      SCANNER ORCHESTRATOR                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   ┌──────────────┐    ┌──────────────┐    ┌──────────────┐      │
│   │   CT Logs    │    │   Wordlist   │    │ AI Generator │      │
│   │   (crt.sh)   │    │    Loader    │    │  (optional)  │      │
│   └──────┬───────┘    └──────┬───────┘    └──────┬───────┘      │
│          │                   │                   │              │
│          └───────────────────┴───────────────────┘              │
│                              │                                  │
│                              ▼                                  │
│                     ┌──────────────────┐                        │
│                     │   Permutation    │                        │
│                     │      Engine      │                        │
│                     └────────┬─────────┘                        │
│                              │                                  │
│                              ▼                                  │
│                     ┌──────────────────┐                        │
│                     │  names channel   │  (buffered, size=1000) │
│                     └────────┬─────────┘                        │
│                              │                                  │
│           ┌──────────────────┼──────────────────┐               │
│           ▼                  ▼                  ▼               │
│      ┌──────────┐       ┌──────────┐       ┌──────────┐         │
│      │ Worker 1 │       │ Worker 2 │       │ Worker N │         │
│      │  Prober  │       │  Prober  │       │  Prober  │         │
│      └────┬─────┘       └────┬─────┘       └────┬─────┘         │
│           │                  │                  │               │
│           └──────────────────┼──────────────────┘               │
│                              ▼                                  │
│                     ┌──────────────────┐                        │
│                     │ results channel  │                        │
│                     └────────┬─────────┘                        │
│                              │                                  │
│                  ┌───────────┴───────────┐                      │
│                  ▼                       ▼                      │
│           ┌─────────────┐         ┌─────────────┐               │
│           │  Inspector  │         │   Output    │               │
│           │ (SDK deep)  │         │   Writer    │               │
│           └─────────────┘         └─────────────┘               │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Project Structure

s3finder/
├── cmd/s3finder/          # CLI entrypoint
│   └── main.go            # Cobra CLI setup, flag parsing
├── pkg/
│   ├── scanner/           # Core scanning logic
│   │   ├── scanner.go     # Worker pool orchestrator
│   │   ├── prober.go      # HTTP bucket checker
│   │   └── inspector.go   # AWS SDK deep inspection
│   ├── ai/                # AI name generation
│   │   ├── generator.go   # Provider interface
│   │   ├── openai.go      # OpenAI implementation
│   │   ├── ollama.go      # Ollama implementation
│   │   ├── anthropic.go   # Anthropic implementation
│   │   └── gemini.go      # Gemini implementation
│   ├── recon/             # Reconnaissance modules
│   │   └── ctlogs.go      # CT log subdomain discovery (crt.sh)
│   ├── permutation/       # Name generation
│   │   └── engine.go      # Suffix/prefix/year patterns
│   ├── ratelimit/         # Rate limiting
│   │   └── adaptive.go    # AIMD rate limiter
│   └── output/            # Result output
│       ├── writer.go      # Writer interface
│       ├── realtime.go    # Terminal output
│       └── report.go      # JSON/TXT reports
├── internal/config/       # Configuration
│   └── config.go          # Config structs, defaults
├── wordlists/             # Default wordlists
│   └── common.txt         # 130 common patterns
├── Makefile               # Build automation
└── .goreleaser.yaml       # Release automation

Core Components

Scanner Orchestrator

Coordinates the entire scanning process. Manages the worker pool, the channels, and graceful shutdown.

pkg/scanner/scanner.go

HTTP Prober

Performs HEAD requests to S3 endpoints. Features HTTP/2 support, connection pooling, keep-alives, and automatic retry with exponential backoff.

pkg/scanner/prober.go
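The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the code in prober.go: the names retry and errTransient, the attempt count, and the base delay are all assumptions, and a fake operation stands in for the HEAD request so the example runs without network access.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a hypothetical stand-in for a retryable failure such as a timeout.
var errTransient = errors.New("transient failure")

// retry calls fn until it succeeds or maxAttempts is exhausted.
// The wait starts at base and doubles after every failed attempt.
func retry(maxAttempts int, base time.Duration, fn func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	delay := base
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2 // exponential backoff
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	// Fake probe that fails twice before succeeding.
	err := retry(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println(calls, err)
}
```

A real prober would probably also honour context cancellation while waiting, and might add jitter to the delay so that workers do not retry in lockstep.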

Deep Inspector

Uses the AWS SDK to gather detailed bucket information: region, ACL, and object listing. Handles region mismatches with an automatic retry.

pkg/scanner/inspector.go

Adaptive Rate Limiter

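An AIMD (additive increase, multiplicative decrease) limiter that adjusts the request rate automatically in response to 429/503 responses from AWS. The rate has a floor of 20 RPS to keep scans usable.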

pkg/ratelimit/adaptive.go
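To make the AIMD idea concrete, here is a small sketch of the rate update rule. Only the 20 RPS floor and the 429/503 trigger come from the description above; the type name aimdLimiter, the step of +1 RPS, the halving factor, and the 500 RPS ceiling are assumptions chosen for illustration.

```go
package main

import "fmt"

// aimdLimiter tracks a target request rate in requests per second.
type aimdLimiter struct {
	rate     float64 // current target RPS
	floor    float64 // lower bound (20 RPS per the description above)
	ceiling  float64 // upper bound (assumed)
	increase float64 // additive step on a normal response (assumed)
	decrease float64 // multiplicative factor on throttling (assumed)
}

func newAIMDLimiter(start float64) *aimdLimiter {
	return &aimdLimiter{rate: start, floor: 20, ceiling: 500, increase: 1, decrease: 0.5}
}

// observe updates the target rate from one HTTP status code.
func (l *aimdLimiter) observe(status int) {
	if status == 429 || status == 503 {
		// Multiplicative decrease: back off quickly when AWS throttles.
		l.rate *= l.decrease
		if l.rate < l.floor {
			l.rate = l.floor
		}
		return
	}
	// Additive increase: probe for more capacity slowly.
	l.rate += l.increase
	if l.rate > l.ceiling {
		l.rate = l.ceiling
	}
}

func main() {
	l := newAIMDLimiter(100)
	for _, status := range []int{429, 429, 503, 200} {
		l.observe(status)
		fmt.Println(status, l.rate)
	}
}
```

In a design like this, the computed rate could be handed to a token bucket such as golang.org/x/time/rate via Limiter.SetLimit, so that the limiter paces requests while the AIMD rule only decides the target.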

Custom DNS Resolver

Sends lookups to Google (8.8.8.8) and Cloudflare (1.1.1.1) DNS so that high-volume scans do not saturate the local resolver.

pkg/dns/resolver.go
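One way to build such a resolver with the standard library is shown below. It is a sketch under assumptions, not the contents of resolver.go: the names publicDNS and newResolver and the timeout value are hypothetical, and the example only constructs the resolver without performing a lookup.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"time"
)

// publicDNS lists the upstream servers named in the text, with the DNS port added.
var publicDNS = []string{"8.8.8.8:53", "1.1.1.1:53"}

// newResolver returns a resolver that bypasses the system resolver and
// tries each upstream server in order.
func newResolver(servers []string, timeout time.Duration) *net.Resolver {
	return &net.Resolver{
		PreferGo: true, // use Go's built-in DNS client so that Dial is honoured
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: timeout}
			lastErr := errors.New("no DNS servers configured")
			for _, server := range servers {
				// Ignore the address Go picked and dial our own server instead.
				// For UDP, dialing does not contact the server, so this
				// fallback mostly matters for TCP and local socket errors.
				conn, err := d.DialContext(ctx, network, server)
				if err == nil {
					return conn, nil
				}
				lastErr = err
			}
			return nil, lastErr
		},
	}
}

func main() {
	r := newResolver(publicDNS, 2*time.Second)
	fmt.Println(r.PreferGo, len(publicDNS))
}
```

To use it for HTTP probing, the resolver can be set as the Resolver field of the net.Dialer whose DialContext is passed to the http.Transport.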

Progress Bar

Live TUI progress display showing scanned count, RPS, ETA, and discovery statistics in real time.

pkg/output/progress.go

Permutation Engine

Generates 780+ bucket name variations using common suffix, prefix, and year patterns.

pkg/permutation/engine.go
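A much reduced sketch of suffix/prefix/year permutation is shown below. The function name permute, the "-" separator, and the sample affixes are assumptions for illustration; the actual pattern set in engine.go is not reproduced here. The length check reflects the S3 rule that bucket names are 3 to 63 characters long.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// permute expands one base name into candidate bucket names.
// Duplicates and names outside the 3-63 character range are dropped.
func permute(base string, affixes []string, years []int) []string {
	base = strings.ToLower(strings.TrimSpace(base))
	seen := make(map[string]bool)
	var out []string
	add := func(name string) {
		if len(name) < 3 || len(name) > 63 || seen[name] {
			return
		}
		seen[name] = true
		out = append(out, name)
	}

	add(base)
	for _, a := range affixes {
		add(base + "-" + a) // suffix pattern
		add(a + "-" + base) // prefix pattern
	}
	for _, y := range years {
		add(base + "-" + strconv.Itoa(y)) // year pattern
	}
	return out
}

func main() {
	for _, name := range permute("Acme", []string{"dev", "backup"}, []int{2024}) {
		fmt.Println(name)
	}
}
```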

CT Log Client

Queries crt.sh to discover subdomains via Certificate Transparency logs.

pkg/recon/ctlogs.go
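The sketch below shows how a crt.sh JSON response might be turned into a list of subdomains. It relies on the name_value field of crt.sh's JSON output, which can hold several newline-separated names. The helper names ctQueryURL and parseCTNames are hypothetical, and the example parses a canned response instead of calling crt.sh.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"sort"
	"strings"
)

// ctEntry holds the one field of a crt.sh JSON record that this sketch needs.
type ctEntry struct {
	NameValue string `json:"name_value"`
}

// ctQueryURL builds a crt.sh query for all names under domain ("%" is the wildcard).
func ctQueryURL(domain string) string {
	return "https://crt.sh/?q=" + url.QueryEscape("%."+domain) + "&output=json"
}

// parseCTNames extracts unique, sorted host names that belong to domain.
func parseCTNames(body []byte, domain string) ([]string, error) {
	var entries []ctEntry
	if err := json.Unmarshal(body, &entries); err != nil {
		return nil, err
	}
	seen := make(map[string]bool)
	var out []string
	for _, e := range entries {
		for _, name := range strings.Split(e.NameValue, "\n") {
			name = strings.TrimPrefix(strings.ToLower(strings.TrimSpace(name)), "*.")
			if name == "" || seen[name] {
				continue
			}
			if name != domain && !strings.HasSuffix(name, "."+domain) {
				continue // unrelated name on a shared certificate
			}
			seen[name] = true
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out, nil
}

func main() {
	sample := []byte(`[{"name_value":"api.example.com\n*.dev.example.com"},{"name_value":"other.org"}]`)
	names, err := parseCTNames(sample, "example.com")
	fmt.Println(ctQueryURL("example.com"), names, err)
}
```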

AI Generator

Provider-agnostic interface for LLM-powered name generation.

pkg/ai/generator.go
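A provider-agnostic design usually comes down to one small interface that each backend implements. The sketch below is a guess at that shape, not the interface defined in generator.go: Provider, its method set, staticProvider, and suggest are all hypothetical, and staticProvider stands in for an LLM-backed implementation so that the example runs offline.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Provider is a hypothetical interface; the real one may differ.
type Provider interface {
	Name() string
	Generate(ctx context.Context, target string, count int) ([]string, error)
}

// staticProvider stands in for an LLM-backed provider such as OpenAI or Ollama.
type staticProvider struct{ suffixes []string }

func (p staticProvider) Name() string { return "static" }

func (p staticProvider) Generate(ctx context.Context, target string, count int) ([]string, error) {
	if err := ctx.Err(); err != nil {
		return nil, err // respect cancellation like a network-backed provider would
	}
	var names []string
	for _, s := range p.suffixes {
		if len(names) == count {
			break
		}
		names = append(names, strings.ToLower(target)+"-"+s)
	}
	return names, nil
}

// suggest depends only on the interface, so any provider can be swapped in.
func suggest(ctx context.Context, p Provider, target string, count int) ([]string, error) {
	names, err := p.Generate(ctx, target, count)
	if err != nil {
		return nil, fmt.Errorf("%s: %w", p.Name(), err)
	}
	return names, nil
}

// demoNames wires the stand-in provider into suggest.
func demoNames() ([]string, error) {
	p := staticProvider{suffixes: []string{"backup", "assets", "logs"}}
	return suggest(context.Background(), p, "Acme", 2)
}

func main() {
	names, err := demoNames()
	fmt.Println(names, err)
}
```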

Data Flow

  1. CT Log Discovery (optional): Query crt.sh for subdomains of target domain
  2. Name Generation: Wordlist + CT seeds + Permutation Engine + AI (optional) produce unique bucket names
  3. Channel Feed: Names are fed into a buffered channel (backpressure-safe)
  4. Worker Pool: N workers read from channel, each with its own HTTP client
  5. Probing: HTTP HEAD request to https://{bucket}.s3.amazonaws.com
  6. Rate Limiting: Each request passes through adaptive rate limiter
  7. Result Processing: Hits (200/403) go to Inspector for deep analysis
  8. Output: Results stream to terminal and accumulate for final report

Probe Results

HTTP Status   Result            Meaning
200           BucketExists      Bucket is publicly readable
403           BucketForbidden   Bucket exists but access is denied
404           BucketNotFound    Bucket does not exist
301/307       BucketForbidden   Redirect (bucket is in a different region)
Other         BucketError       Network error or unexpected response
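The mapping in the table above can be written as a single switch. The result names come from the table; the ProbeResult type and the classify function are assumed names for this sketch.

```go
package main

import (
	"fmt"
	"net/http"
)

// ProbeResult is the outcome of one HEAD probe.
type ProbeResult int

const (
	BucketExists ProbeResult = iota
	BucketForbidden
	BucketNotFound
	BucketError
)

func (r ProbeResult) String() string {
	switch r {
	case BucketExists:
		return "BucketExists"
	case BucketForbidden:
		return "BucketForbidden"
	case BucketNotFound:
		return "BucketNotFound"
	default:
		return "BucketError"
	}
}

// classify maps an HTTP status code to a probe result, following the table.
// Transport failures carry no status code; a caller would report those as
// BucketError directly.
func classify(status int) ProbeResult {
	switch status {
	case http.StatusOK:
		return BucketExists
	case http.StatusForbidden, http.StatusMovedPermanently, http.StatusTemporaryRedirect:
		return BucketForbidden
	case http.StatusNotFound:
		return BucketNotFound
	default:
		return BucketError
	}
}

func main() {
	for _, status := range []int{200, 403, 404, 301, 307, 500} {
		fmt.Printf("%d -> %s\n", status, classify(status))
	}
}
```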

Concurrency Model

s3finder uses Go's goroutines and channels for high-performance concurrent scanning:

  • Producer: Single goroutine feeds names into buffered channel
  • Consumers: N worker goroutines read from channel and probe buckets
  • Results: Workers send results to output channel
  • Shutdown: Context cancellation propagates to all workers

The buffered channel (size 1000) provides backpressure: if workers can't keep up, the producer blocks until space is available.
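The producer/consumer arrangement described above can be sketched as follows. This is an illustrative outline and not the code in scanner.go: the names scan, scanAll, and result are hypothetical, and a probe function is passed in so that the example runs without network access. Only the channel size of 1000 is taken from the text.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// result pairs a bucket name with the status its probe returned.
type result struct {
	name   string
	status int
}

// scan feeds names through a buffered channel to a pool of workers and
// collects their results. Cancelling ctx stops the producer and the workers.
func scan(ctx context.Context, names []string, workers int, probe func(string) int) []result {
	namesCh := make(chan string, 1000) // buffer size from the text
	resultsCh := make(chan result)

	// Producer: blocks when the buffer is full, which is the backpressure.
	go func() {
		defer close(namesCh)
		for _, n := range names {
			select {
			case namesCh <- n:
			case <-ctx.Done():
				return
			}
		}
	}()

	// Consumers: N workers read names and send results.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range namesCh {
				select {
				case resultsCh <- result{name: n, status: probe(n)}:
				case <-ctx.Done():
					return
				}
			}
		}()
	}

	// Close the results channel once every worker has finished.
	go func() {
		wg.Wait()
		close(resultsCh)
	}()

	var out []result
	for r := range resultsCh {
		out = append(out, r)
	}
	return out
}

// scanAll is a convenience wrapper that never cancels.
func scanAll(names []string, workers int, probe func(string) int) []result {
	return scan(context.Background(), names, workers, probe)
}

func main() {
	fake := func(name string) int {
		if name == "acme-backup" {
			return 200
		}
		return 404
	}
	fmt.Println(len(scanAll([]string{"acme", "acme-backup", "acme-dev"}, 2, fake)))
}
```

Results arrive in whatever order the workers finish, so a caller that needs stable output would sort them or key them by name.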

Dependencies

Package                             Purpose
github.com/spf13/cobra              CLI framework
github.com/aws/aws-sdk-go-v2        AWS SDK for deep inspection
github.com/sashabaranov/go-openai   OpenAI API client
golang.org/x/time/rate              Token bucket rate limiter
golang.org/x/sys/windows            Windows ANSI color support