
Web Application Reverse Engineering
Step 1: Application Vector Analysis
- Subdomain Analysis:
  - `api.example.com` → Backend API functions
  - `admin.example.com` → Privileged operations
- Technology and Architecture Fingerprinting
  - By Tools: `whatweb`, Wappalyzer
  - Architecture pattern
    - MVC (Model-View-Controller): Django, Ruby on Rails, Laravel.
      - Common Routes: `/users`, `/posts`, `/dashboard`
      - Expected Responses:
        - HTML with server-side rendering.
      - Authentication Flow:
        - Cookies & sessions for login persistence.
    - MVVM / SPA: Angular, Vue.js (if client-side rendering dominates).
      - Common Routes: `/api/v1/users`, `/api/v1/posts`
      - Expected Responses:
        - JSON responses (minimal HTML).
      - Authentication Flow:
        - JWT (JSON Web Token) or OAuth-based authentication.
    - Microservices Architecture
      - Common Patterns:
        - Separate APIs and subdomains handling different functionalities:
          - `auth.api.com` → Authentication
          - `payment.api.com` → Payment processing
          - `user.api.com` → User management
      - Expected Responses:
        - JSON responses
        - Redirection
      - Authentication Flow:
        - API Gateway handling requests, JWT-based authentication.
    - Monolithic: Few subdomains (`www`, `blog`, `static`).
- Server-Side Rendered (SSR) vs. Single Page App (SPA)?
  - SSR → Backend routes dynamically generate HTML.
  - SPA → Heavy JavaScript with API-driven functionality.
- Client-Heavy vs. Server-Heavy Logic?
  - Client-Heavy → Reverse JS & Frontend State Handling.
  - Server-Heavy → Extract Backend Routes & API Interaction.
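The SSR vs. SPA distinction above can be approximated from a single HTTP response. The following is a minimal sketch, not a definitive detector: the function name `guess_rendering` and the 20-word threshold are assumptions made for illustration, and real applications (hybrid or hydrated pages) will need more signals.

```python
import re


def guess_rendering(content_type: str, body: str) -> str:
    """Return 'api', 'spa', or 'ssr' as a rough guess for one HTTP response."""
    if "json" in content_type.lower():
        return "api"
    # Drop <script>/<style> blocks, then all tags, to estimate server-sent visible text.
    stripped = re.sub(r"(?is)<(script|style)\b.*?</\1>", "", body)
    visible_text = re.sub(r"(?s)<[^>]+>", " ", stripped)
    word_count = len(visible_text.split())
    script_count = len(re.findall(r"(?i)<script\b", body))
    # Assumed threshold: almost no visible text plus at least one script
    # suggests an empty shell that JavaScript fills in later (SPA).
    if script_count >= 1 and word_count < 20:
        return "spa"
    return "ssr"
```

A page that returns little more than a root `<div>` and a script bundle is classified as a SPA shell, while a page with substantial text in the initial HTML is treated as server-rendered.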
Step 2: Application URL Analysis
- Path Analysis: Endpoints can be analyzed through their URI structure to infer potential functionality:
  - Resource-Oriented Paths
    - Example: `/users/{id}/profile`
    - Indication: Typically follows RESTful CRUD (Create, Read, Update, Delete) patterns, suggesting operations on user profile resources. Common HTTP methods: `GET` (Retrieve), `POST`/`PUT` (Create/Update), `DELETE` (Remove).
  - Action-Oriented Paths
    - Example: `/checkout/payment`
    - Indication: Represents a transactional workflow, likely involving:
      - Payment processing
      - Order finalization
      - State transitions in a checkout sequence
- Query Parameters Analysis:
  - `?action=edit` → Implies a state-modifying function
  - `?format=json` → Suggests an API endpoint
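The path and query heuristics in this step can be sketched with the standard library's URL parser. The `ACTION_WORDS` set and the `analyze_url` helper are hypothetical names introduced here; the verb list in particular is an assumption and should be replaced with verbs actually observed in the target application.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical verb list: path segments that suggest an action rather than a resource.
ACTION_WORDS = {"checkout", "payment", "login", "logout", "search", "export", "add"}


def analyze_url(url: str) -> dict:
    """Split a URL into path segments and derive rough functional hints."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    params = parse_qs(parts.query)
    is_action = any(s.lower() in ACTION_WORDS for s in segments)
    hints = []
    if "action" in params:
        hints.append("state-modifying")   # e.g. ?action=edit
    if params.get("format") == ["json"]:
        hints.append("api-endpoint")      # e.g. ?format=json
    return {
        "segments": segments,
        "style": "action-oriented" if is_action else "resource-oriented",
        "hints": hints,
    }
```

For example, `analyze_url("https://example.com/checkout/payment?action=edit")` is labeled action-oriented with a state-modifying hint, while `/users/42/profile?format=json` is labeled resource-oriented with an API hint.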
Step 3: Function & API Analysis
API Classification and Analysis
Aspect | RESTful API | HTML Response (MVC-Based APIs) | GraphQL API | Microservices API |
---|---|---|---|---|
Format | JSON | HTML | JSON | JSON or other (varies per service) |
Example | json { "id": 1, "name": "John Doe", "email": "[email protected]" } | html <html><head><title>User Profile</title></head><body><h1>John Doe</h1></body></html> | json { "data": { "user": { "id": "1", "name": "John Doe" } } } | JSON responses are common across different services. |
Endpoint(s) | Multiple structured endpoints (/users , /posts ) | Server-side rendered page (no true API) | Single /graphql endpoint with flexible queries | Separate subdomains for each microservice (e.g., auth.api.com , payment.api.com ) |
Request Type | Standard HTTP methods (GET, POST, PUT, DELETE) | Typically a GET request for page rendering | Flexible queries using POST requests | Varies per service (usually HTTP requests) |
Data Structure | Key-value pairs (JSON) | HTML content with embedded dynamic data | Key-value pairs with flexible queries (only requested fields) | Key-value pairs (JSON) with varied structures depending on service |
Backend Function | Exposes specific functionality as a service, typically for data retrieval and management | Server-side rendered page with backend logic for rendering views | Exposes specific functionality based on queries | Each microservice exposes specific functionality related to its domain |
Use Case | Suitable for applications needing data exchange with defined endpoints. | Suitable for rendering pages directly on the server. | Ideal for frontend applications with dynamic, flexible queries. | Ideal for large systems with modular, isolated services (e.g., authentication, payment) |
State | Stateless (each request is independent and carries all the context it needs) | Typically stateful (server retains session information) | Stateless (requests are independent) | Stateless (services can be independently scaled) |
Example Use Case | E-commerce website: retrieving user profile (/users/1 ), posting data (/posts ) | Server-rendered user profile page with user data | Flexible user data retrieval: /graphql?query={user(id: 1){name}} | Independent services for user management (auth.api.com ), payments (payment.api.com ) |
Functions Mapping
- CRUD Operations Endpoints
  - `GET`, `POST` methods for API function endpoints
  - `PUT`, `PATCH`, `DELETE` methods
- Authentication Endpoints:
  - JWT (JSON Web Token): Commonly used in REST APIs and microservices. JWTs are typically passed in HTTP headers (e.g., `Authorization: Bearer <token>`).
  - OAuth2: Used for delegating authorization to external identity providers (e.g., Google, Facebook).
  - API Keys: Simple way to authenticate clients, especially in public APIs.
  - Basic Auth: Simple authentication mechanism but less secure, since credentials are only Base64-encoded and are exposed without TLS.
  - Session-based Authentication: Typically used in MVC-based architectures, storing session data server-side.
- Authorization Endpoints:
  - Role-based Access Control (RBAC): Restricting access to resources based on the user’s role.
  - Attribute-based Access Control (ABAC): More granular, where access is based on attributes (e.g., user’s department, region).
  - OAuth2 Scopes: For fine-grained access control in OAuth-based systems.
- Rate Limiting Endpoints
  - Rate Limiting: Imposes a maximum number of requests per client, per unit of time (e.g., 100 requests per minute).
  - Token Bucket: Tokens are added to a bucket at a fixed rate up to a maximum capacity, and each request consumes a token. Short bursts are allowed while tokens remain.
  - Leaky Bucket: Similar to Token Bucket, but requests drain at a constant rate, which gives a smoother, more predictable flow.
  - Throttling Behavior: Temporarily blocks further requests from a client once the limit is reached, or provides an error response such as HTTP 429 (Too Many Requests).
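To make the token bucket behavior concrete, here is a minimal sketch. The class name `TokenBucket` and its parameters are illustrative assumptions, not any particular server's implementation. Time is passed in explicitly so the behavior is easy to reason about; a real limiter would typically read a monotonic clock instead.

```python
class TokenBucket:
    """Minimal token bucket: refills at a fixed rate, each request costs one token."""

    def __init__(self, capacity: float, refill_rate: float, now: float = 0.0):
        self.capacity = capacity          # maximum burst size
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = capacity            # start with a full bucket
        self.last = now                   # timestamp of the last refill

    def allow(self, now: float) -> bool:
        # Refill according to elapsed time, never exceeding capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # a server would typically answer with HTTP 429 here
```

With `TokenBucket(capacity=2, refill_rate=1.0)`, two immediate requests pass, a third at the same instant is rejected, and one more is admitted after a second has elapsed. Observing when 429 responses start and stop gives a rough estimate of a target's capacity and refill rate.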
Step 4: Object and Data Analysis from Responses
Data Behavior
- JSON Data
  - Lightweight structure (compared to XML).
  - Compatibility with virtually all programming languages (Python, Java, Go, .NET, etc.).
  - Easy conversion to objects in most languages (`JSON.parse()` in JavaScript, `json.loads()` in Python).
  - Schema Design & Versioning
    - JSON is schema-less, but API designers can enforce structure using JSON Schema.
    - Versioning Strategies:
      - Use URL versioning: `/api/v1/users`
      - Use header-based versioning: `Accept: application/vnd.company.v1+json`
      - Use field-based versioning: `{ "version": "1.0", "data": { ... } }`
  - JSON responses caching layers:
    - Edge caching (CDN like Cloudflare, Akamai)
    - Application-level caching (Redis, Memcached)
    - HTTP caching headers (`Cache-Control`, `ETag`, `Last-Modified`)
  - Compression techniques:
    - Gzip or Brotli to reduce payload size.
    - Minimizing JSON fields to avoid unnecessary data transfer.
  - Handling Large Data Sets
    - Pagination techniques:
      - Offset-based: `/api/users?offset=0&limit=50`
      - Cursor-based: `/api/users?cursor=eyJpZCI6IjEwMCJ9`
    - Batch Processing: Instead of multiple small requests, use batch endpoints: `{"users": [1, 2, 3, 4, 5]}`
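Cursor values such as `eyJpZCI6IjEwMCJ9` are often just Base64-encoded JSON, which makes them worth decoding during analysis. The sketch below assumes that encoding (URL-safe Base64 without padding, wrapping a JSON object); the helper names are hypothetical, and many APIs use signed or encrypted cursors that will not decode this way.

```python
import base64
import json


def decode_cursor(cursor: str) -> dict:
    """Decode an opaque cursor, assuming it is unpadded URL-safe Base64 over JSON."""
    padded = cursor + "=" * (-len(cursor) % 4)  # restore stripped '=' padding
    return json.loads(base64.urlsafe_b64decode(padded))


def encode_cursor(state: dict) -> str:
    """Inverse of decode_cursor: compact JSON, URL-safe Base64, padding stripped."""
    raw = json.dumps(state, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).decode().rstrip("=")
```

Decoding the cursor from the pagination example yields `{"id": "100"}`, which suggests the server pages by record ID and resumes after the last ID seen.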
- HTML Data
  - SEO Benefits: Search engines can crawl SSR pages better than JS-heavy SPAs.
  - Fast initial load times: No need to wait for client-side JS execution.
  - Common in MVC frameworks:
    - Django (Python) → `render(request, 'template.html', context)`
    - ASP.NET MVC (C#) → `return View(model)`
    - Ruby on Rails → `render "template.html.erb"`
  - Templating & Reusability
    - Component-based HTML rendering:
      - Jinja2 (Python) → `{% include "navbar.html" %}`
      - Thymeleaf (Java Spring Boot) → `<th:block th:insert="navbar :: content" />`
      - Handlebars (Node.js) → `{{> partials/navbar}}`
    - HTML Fragment APIs:
      - Used in frameworks like HTMX, Turbo to dynamically update parts of a page without a full refresh.
  - Performance Optimization
    - Minimizing HTML payloads:
      - Remove redundant elements.
      - Use lazy loading for images (`<img loading="lazy">`).
    - Edge caching (CDN like Cloudflare, Fastly):
      - Cache full HTML responses for unauthenticated users.
- GraphQL Fetched Data
  - Data Modeling and Schema
    - Defining a GraphQL Schema (SDL - Schema Definition Language): `type User { id: ID! name: String! email: String } type Query { user(id: ID!): User }`
    - Resolvers & Data Fetching Logic: `const resolvers = { Query: { user: (parent, args, context) => { return db.findUserById(args.id); } } };`
  - Over-fetching & Under-fetching
    - REST: `/api/users/1` → Returns fixed fields (id, name, email, address, etc.)
    - GraphQL: Fetches only what is needed.
  - Single Endpoint vs. Multiple Endpoints
    - REST: `/api/users`, `/api/posts`, `/api/comments`
    - GraphQL: `/graphql` handles all requests.
  - Query Optimization
    - Batching & DataLoader Pattern:
      - Instead of making multiple database calls, use DataLoader (Facebook’s library) to batch requests.
    - Pagination & Filtering: `query { users(limit: 10, cursor: "xyz") { id name } }`
  - GraphQL in Microservices
    - Federation (Apollo Federation)
      - Combines multiple GraphQL APIs into a single schema.
    - Hybrid Approach (REST + GraphQL)
      - Use GraphQL as a gateway for REST microservices.
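The "fetch only what is needed" idea can be illustrated by building the JSON body that a client would POST to a `/graphql` endpoint. This is a sketch under assumptions: it reuses the `user(id: ID!)` field from the schema example above, the helper name `build_graphql_request` is hypothetical, and nothing is sent over the network.

```python
import json


def build_graphql_request(fields: list[str], user_id: str) -> str:
    """Build a JSON request body that selects only the given User fields."""
    # Reject anything that is not a plain identifier, so the selection set stays well-formed.
    if not fields or not all(f.isidentifier() for f in fields):
        raise ValueError("fields must be non-empty plain identifiers")
    selection = " ".join(fields)
    # The ID is passed as a variable rather than interpolated into the query text.
    query = f"query($id: ID!) {{ user(id: $id) {{ {selection} }} }}"
    return json.dumps({"query": query, "variables": {"id": user_id}})
```

Calling `build_graphql_request(["id", "name"], "1")` produces a body whose query asks for `id` and `name` only, so a conforming server omits `email` and any other fields from the response. Varying the selection set while watching responses is a quick way to map which fields a schema exposes.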
Data Structure
Once server responses are obtained, analyze:
- Field names & types (IDs, timestamps, relationships).
  - Primary keys (`id`)
    - Identifiers for database records (e.g., `id`).
    - Found in `<p>`, `<div>`, or embedded in `data-` attributes.
  - Date fields (`created_at`)
    - Indicates sorting & timeline features
- Nested Objects Relationships (User → Posts):
  - Nested relationships (`users → posts`).
  - Used for structured data display (nested objects).
  - Represented in lists (`<ul>`, `<li>`) or tables (`<table>`).
- Links (`<a>`)
  - Shows relationships between resources.
- Forms (`<form>`)
  - Shows user-interaction points (e.g., login, search).
  - Forms indicate data submission points, similar to `POST` or `PUT` requests in REST APIs.
- Pagination & Sorting:
  - Links like `?page=2` or `?sort=asc` may be embedded in `<a href="...">` elements.
  - Pagination in HTML is usually done through navigation buttons or numbered links.
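For HTML responses, the structural signals listed above (links, forms, and `data-` attributes) can be collected with the standard library's HTML parser. The class name `StructureExtractor` and the sample markup in the usage note are assumptions made for illustration.

```python
from html.parser import HTMLParser


class StructureExtractor(HTMLParser):
    """Collect links, form targets, and data-* attributes from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []        # href values from <a> elements
        self.forms = []        # (METHOD, action) pairs from <form> elements
        self.data_attrs = []   # (name, value) pairs for data-* attributes

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("href"):
            self.links.append(attr_map["href"])
        elif tag == "form":
            # HTML forms default to GET when no method is given.
            method = (attr_map.get("method") or "get").upper()
            self.forms.append((method, attr_map.get("action") or ""))
        for name, value in attrs:
            if name.startswith("data-"):
                self.data_attrs.append((name, value))
```

Feeding a page such as `<div data-user-id="42"><a href="/users/42/posts?page=2">Posts</a></div>` into `StructureExtractor().feed(...)` surfaces the embedded record ID and the paginated relationship link in one pass.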
Step 5: Functional Hypothesis Table Based on the URL Behavioral Mapping
Deconstruct URL path semantics to reconstruct system design.
URL Pattern | Predicted Function | Likely Objects |
---|---|---|
/users | User listing (index) | User[] |
/users/:id | User profile (read) | User |
/users/:id/edit | Profile editing (update) | User , Form |
/products/search?q= | Product search (query) | Product[] |
/cart/add?item_id= | Cart modification (mutate) | Cart , Product |
/users/42/profile | Fetching user profile data | User { id, name, email } |
/cart/add?product=12 | Adding item to a shopping cart | Cart { items[] } |
/transactions/export.csv | Generates a transaction report | Transaction { id, total } |
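Before filling in a hypothesis table like this one, it helps to collapse concrete URLs into patterns. The sketch below assumes that purely numeric path segments are record identifiers and rewrites them as `:id`; the function names are hypothetical, and non-numeric IDs (UUIDs, slugs) would need extra rules.

```python
import re
from collections import Counter
from urllib.parse import urlsplit


def to_pattern(url: str) -> str:
    """Replace numeric path segments with ':id' and drop the query string."""
    path = urlsplit(url).path
    segments = [":id" if re.fullmatch(r"\d+", s) else s for s in path.split("/") if s]
    return "/" + "/".join(segments)


def group_patterns(urls) -> Counter:
    """Count how many observed URLs fall under each normalized pattern."""
    return Counter(to_pattern(u) for u in urls)
```

With this, `/users/42/profile` maps to `/users/:id/profile`, and a crawl log of many user URLs reduces to a handful of patterns that can each be given a predicted function.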
Step 6: Page-Function-Object Matrix and Callbacks Review
- Extract function callbacks from the request flow
- Extract callbacks from front-end JS code
  - `grep -Eo "(fetch|axios|XMLHttpRequest).*" main.js`
  - `linkfinder -i main.js -o results.txt`
- Tools for AST inspection, call graphs, and diagrams:
  - https://astexplorer.net/
  - https://code2flow.com/
  - https://mermaid.js.org/
  - https://d3js.org/
  - https://github.com/Persper/js-callgraph
  - https://github.com/mermaid-js/mermaid
Page | Functions | Objects Used | HTTP Calls |
---|---|---|---|
/products | loadProducts() | Product[] | GET /api/products |
/cart | checkout() | Cart , Order | POST /api/checkout |
/users/:id | fetchUser() | User , Profile | GET /api/users/:id |
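The HTTP Calls column of a matrix like this can be seeded by scanning front-end JavaScript for `fetch` and `axios` calls. This is a rough sketch in the spirit of the `grep` command in this step: it only catches calls whose first argument is a string or template literal, it ignores `XMLHttpRequest`, and the name `extract_calls` is hypothetical. Minified or dynamically built URLs need an AST-based tool.

```python
import re

# Matches fetch("...") and axios.get/post/put/patch/delete("...") with a literal first argument.
CALL_RE = re.compile(
    r"""\b(?:fetch|axios(?:\.(get|post|put|patch|delete))?)\(\s*["'`]([^"'`]+)["'`]"""
)


def extract_calls(js_source: str):
    """Return (METHOD, url) pairs; METHOD is 'UNKNOWN' when it is not in the call name."""
    calls = []
    for match in CALL_RE.finditer(js_source):
        # fetch() carries its method in an options object, which this sketch does not parse.
        method = (match.group(1) or "unknown").upper()
        calls.append((method, match.group(2)))
    return calls
```

Template literals are returned as written, so an entry such as `/api/users/${id}` signals a parameterized route that corresponds to `/api/users/:id` in the matrix.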
Step 7: Error Debugging
Frontend (Client-Side) Issues
Error Code | Scenario | Technology | Cause | Expected Behavior | Fix |
---|---|---|---|---|---|
300 Multiple Choices | Parameter leads to multiple possible pages | React, Angular, Vue | Query parameter redirects to multiple choices (like in search results) | Browser asks user for selection | Use stricter filtering or direct URL handling |
400 Bad Request | Missing mandatory query parameter | JavaScript (Vanilla, React, Angular) | Missing id in URL query | App crashes or defaults to empty result | Use fallback/default values or notify the user |
400 Bad Request | Sole required parameter is missing | React Router, Angular Router | URL or query string is empty | Server returns 400 | Ensure URL parameters are present before routing |
422 Unprocessable Entity | Missing or incorrect data type for parameter | JavaScript, TypeScript | Type mismatch (string instead of int ) | Throws JavaScript error or fails silently | Validate data types using typeof or parseInt() |
500 Internal Server Error | WebAssembly type mismatch | WASM | Invalid memory read/write | Application crashes | Perform input validation and type-checking before execution |
Backend (Server-Side) Issues
PHP Frameworks (Laravel, Symfony, CodeIgniter)
Error Code | Scenario | Technology | Cause | Expected Behavior | Fix |
---|---|---|---|---|---|
300 Multiple Choices | Parameter leads to multiple routes | Laravel, Symfony | Multiple routes with similar parameters | Browser asks user for selection | Use more specific route definitions |
400 Bad Request | Missing required route parameter | Laravel, Symfony | id parameter missing in route | Returns 400 Bad Request | Use route validation or optional parameter handling |
400 Bad Request | Sole required parameter is missing | CodeIgniter | Form or query parameters are empty | Server returns 400 | Validate form data before submission |
422 Unprocessable Entity | Invalid data type in request | Laravel, Symfony, CodeIgniter | Mismatched data types (e.g., string instead of int ) | Returns 422 | Use validation middleware or, in Laravel, $request->validate() |
500 Internal Server Error | Empty POST body | Laravel, Symfony | Missing body fields | Server returns 500 | Validate body content before processing |
500 Internal Server Error | Database query error (missing parameter) | Symfony, Laravel | Query lacks a required parameter (id for find() ) | Server returns 500 | Use findOrFail() or implement error handling for missing data |
404 Not Found | Route not found for parameter | Symfony, CodeIgniter | URL routing fails for missing or incorrect parameters | Server returns 404 | Implement 404 error handling with custom messages |
Java-based Frameworks (Spring Boot, Spring MVC)
Error Code | Scenario | Technology | Cause | Expected Behavior | Fix |
---|---|---|---|---|---|
300 Multiple Choices | Multiple redirects or resource selections | Spring Boot, Spring MVC | Multiple endpoints serving similar resources | Browser is asked to choose | Implement canonical routing or stricter filtering |
400 Bad Request | Missing required parameter in request | Spring Boot, Spring MVC | Query or form parameters are missing (id , name ) | Returns 400 Bad Request | Validate parameters before processing the request |
400 Bad Request | Sole required parameter is missing | Spring MVC | Single parameter required for URL | Server returns 400 | Use @RequestParam with a defaultValue or validation |
422 Unprocessable Entity | Data type mismatch in request | Spring Boot | Passing a String where an int is required | Returns 422 | Use @Valid annotation for request validation |
500 Internal Server Error | Invalid database query (missing parameter) | Spring Boot | Query expecting id is not passed | Returns 500 | Ensure parameter validation in controller layer before query execution |
C#/.NET Frameworks (ASP.NET, ASP.NET Core)
Error Code | Scenario | Technology | Cause | Expected Behavior | Fix |
---|---|---|---|---|---|
300 Multiple Choices | Multiple routes handling same resource | ASP.NET, ASP.NET Core | Ambiguous routes with similar parameters | Browser is asked to choose | Use route constraints and more specific routes |
400 Bad Request | Missing required query parameter | ASP.NET, ASP.NET Core | Query parameter id is missing in request | Returns 400 | Use model binding and validation attributes like [Required] |
400 Bad Request | Sole required parameter is missing | ASP.NET | Parameter missing from form or query | Returns 400 | Ensure that form/query validation is in place |
422 Unprocessable Entity | Incorrect parameter type (e.g., string for int) | ASP.NET Core | Type mismatch (e.g., string instead of int ) | Returns 422 | Use model validation to enforce correct types |
500 Internal Server Error | Invalid database query | ASP.NET, ASP.NET Core | SQL query with missing parameter | Returns 500 | Validate query parameters and handle missing fields |
Ruby on Rails (Ruby)
Error Code | Scenario | Technology | Cause | Expected Behavior | Fix |
---|---|---|---|---|---|
300 Multiple Choices | Multiple possible redirections | Ruby on Rails | Multiple routes with the same resource | Browser is asked to choose | Use stricter route definitions |
400 Bad Request | Missing required parameter | Ruby on Rails | Required parameter (id ) is missing | Returns 400 | Use strong parameters and default values |
500 Internal Server Error | Database query failure (missing parameter) | Ruby on Rails | Query expecting id but id is not passed | Returns 500 | Use find_by (which returns nil when no record matches) or rescue ActiveRecord::RecordNotFound raised by find |
Python-based Frameworks (Django, Flask)
Error Code | Scenario | Technology | Cause | Expected Behavior | Fix |
---|---|---|---|---|---|
300 Multiple Choices | Parameter variation leads to multiple routes | Django | Multiple views with similar resource types | Browser is asked to choose | Use explicit URL matching or path converters |
400 Bad Request | Missing required parameter | Flask, Django | Missing id in URL or form data | Returns 400 Bad Request | Ensure parameter presence before processing |
422 Unprocessable Entity | Incorrect data type | Django, Flask | string parameter passed where integer is expected | Returns 422 | Use Django’s form validation or Flask’s request.args.get('id', type=int) for type checking |
500 Internal Server Error | Missing data or parameters in DB query | Flask, Django | Query for missing parameter (id ) | Returns 500 | Use error handling or get_object_or_404 in Django |
JavaScript-based Frameworks (Express.js, NestJS)
Error Code | Scenario | Technology | Cause | Expected Behavior | Fix |
---|---|---|---|---|---|
300 Multiple Choices | Multiple endpoints handle similar resource | Express.js, NestJS | Ambiguous query or route | Browser is asked to choose | Use express.Router or @Controller to define explicit routes |
400 Bad Request | Missing required query or route parameter | Express.js, NestJS | Missing parameter (id , username ) in URL | Returns 400 Bad Request | Validate parameters using middleware or decorators (@Query , @Param ) |
422 Unprocessable Entity | Incorrect data type | Express.js, NestJS | Query parameter of wrong type | Returns 422 | Use validation pipes in NestJS or middleware in Express.js |
500 Internal Server Error | Missing or invalid body | Express.js, NestJS | Missing body or malformed data | Returns 500 | Use middleware like express.json() or @Body() in NestJS to validate request bodies |
Security Edge Cases & Caching
Error Code | Scenario | Technology | Cause | Expected Behavior | Fix |
---|---|---|---|---|---|
300 Multiple Choices | Redirect loop due to misconfigured routing | Nginx, Apache, Cloudflare | Conflicting redirects or query parameters | Infinite redirection loop | Implement canonical redirects, check for circular routes |
400 Bad Request | Proxy stripping of headers or parameters | Nginx, Apache, AWS ALB | Proxy strips or modifies headers or query params | Returns 400 | Configure reverse proxies to preserve headers and query params |
403 Forbidden | WAF blocking or modifying parameters | ModSecurity, AWS WAF | Parameters flagged as malicious | WAF blocks request (403 by default) | Use allowlist rules or bypass WAF for trusted sources |
Step 8: Generate Interactive Diagram
- Hierarchical Site Map
example.com
├── /home
├── /products
│   ├── /:id
│   └── /search
├── /cart
│   ├── /checkout
│   └── /payment
└── /users
    ├── /login
    └── /:id
        ├── /profile
        └── /settings
- Entity-Relationship Diagram (ERD)
mermaid
erDiagram
USER ||--o{ ORDER : places
ORDER ||--|{ PRODUCT : contains
CART ||--o{ PRODUCT : includes
- Functional Flowchart
mermaid
flowchart TD
A[Home] -->|Browse| B[Products]
B -->|Add to Cart| C[Cart]
C -->|Checkout| D[Payment]
D -->|Confirm| E[Order Completed]
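Diagrams like the flowchart above can be generated from observed page transitions rather than written by hand. The sketch below is one possible approach: the function name `to_mermaid` and the `(source, label, target)` tuple format are assumptions, and node IDs are assigned as single letters, so it only handles up to 26 distinct pages.

```python
def to_mermaid(edges):
    """Render (source, label, target) tuples as a Mermaid 'flowchart TD' definition."""
    ids = {}
    lines = ["flowchart TD"]

    def node(name):
        # First mention declares the node with its label; later mentions use the bare ID.
        if name not in ids:
            ids[name] = chr(ord("A") + len(ids))
            return f"{ids[name]}[{name}]"
        return ids[name]

    for src, label, dst in edges:
        lines.append(f"    {node(src)} -->|{label}| {node(dst)}")
    return "\n".join(lines)
```

Passing the transitions Home → Products → Cart → Payment → Order Completed with their action labels reproduces the flowchart definition shown above, which can then be pasted into any Mermaid renderer.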