How to Find Any Company's Tech Stack (A Developer's Guide)


Posted on Feb 21

Let me be upfront about why most tech stack tools are kind of useless for developers.

Tools like Wappalyzer and BuiltWith scrape cookies, meta tags, and frontend JavaScript. They'll tell you a company uses React and Google Analytics. Cool. But you probably already guessed that.

What you actually want to know is: what does their backend look like? What's their data infrastructure? Do they use Datadog or Grafana for observability? What does their auth layer look like?

That information doesn't show up in a browser. It lives in DNS records, HTTP headers, subdomains, job postings, and public repos. And you can get all of it for free if you know where to look.

This guide is organized by depth — starting with the most technical, developer-specific methods and working toward the simpler passive ones. Skip to whatever level you need.

Part 1: Infrastructure Recon (Get your hands dirty)


This is the fastest technique in this entire guide for developers. Most companies expose an API at a predictable URL — api.company.com or api.company.io. You don't need credentials. Just hit it and read what comes back.

curl -sI https://api.company.com/

Even a 401, 403, or 404 response is packed with information. The infrastructure fingerprints itself in the headers.

Example: Running curl -sI https://api.utilimarc.com/ returns a 404, but the Apigw-Requestid header is a dead giveaway for AWS API Gateway. The Apigw- prefix is specific to AWS — no other provider uses it.

Go further: Also try curl -sI https://company.com to check their web infra. The CDN, load balancer, and sometimes even the backend framework leak through the top-level domain's headers too.
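
A few header fingerprints worth recognizing (a shortlist based on common defaults; not exhaustive, and vendors do change header names):

# cf-ray / server: cloudflare       → Cloudflare
# x-amz-cf-id, x-amz-cf-pop         → Amazon CloudFront
# x-served-by: cache-*              → Fastly
# x-vercel-id                       → Vercel
# apigw-requestid                   → AWS API Gateway
# x-azure-ref                       → Azure Front Door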

Every page load is a treasure map. The browser fetches scripts, fonts, pixels, and analytics from dozens of third-party services — all of which have unique hostnames that identify the vendor.

Open Chrome DevTools (F12), go to the Console tab, and run:

[...new Set(
  performance.getEntriesByType("resource")
    .map(r => {
      try { return new URL(r.name).hostname }
      catch { return null }
    })
    .filter(Boolean)
)]
  .filter(h => !h.includes(location.hostname))
  .sort()

This gives you every distinct third-party hostname loaded by the page, deduplicated and sorted, with the first-party domain filtered out.

Pro tip: Do this on the authenticated app (app.company.com), not just the marketing site. The marketing page is often a static site or different stack entirely. The actual product is where the interesting infra shows up — real-time event pipelines, feature flagging, product analytics, the works.

Paste the hostname list into an LLM and ask it to map each domain to a product. This is faster and more up-to-date than any static tool, since new vendor domains get recognized immediately.

CSP is a security feature that tells browsers which domains a site is allowed to load resources from or send data to. But for your purposes, it's a complete manifest of vendor integrations — because if a domain is in the CSP, the app is explicitly using it.
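
You can usually pull it straight from the response headers (a minimal sketch; some sites set the CSP via a meta tag instead, or only on the app subdomain):

curl -sI https://app.company.com/ | grep -i content-security-policy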

The CSP will be a long string of directives like connect-src, script-src, and img-src, each followed by a list of allowed domains.

Copy the entire CSP value and throw it into an LLM: "What developer tools and SaaS products correspond to these domains in this Content Security Policy?" You'll get a categorized breakdown in seconds.

Note: Not every site sets a CSP, and CSPs are sometimes set only on certain responses. If you don't see it, move on — but when it's there, it's one of the most explicit tech signals available.

Companies don't run everything on www. As infrastructure grows, services get their own subdomains — owned by different teams, deployed independently, with separate access controls. And they're often named very literally.

A subdomain called kafka-prod-b2.company.com tells you exactly what's running there. Same for elastic.company.com, grafana.internal.company.com, or consul.company.com.

The easiest free option is pentest-tools.com — they give you two free reports, which is enough for a research session. Enter the domain and get a list of discovered subdomains.

For a command-line approach, Amass is the gold standard, with subfinder as a lighter alternative:

amass enum -passive -d company.com
subfinder -d company.com -silent

Run either against a big org and you'll see how literally hosts get named. A sample of what turns up for nokia.com:

elastic0.cbrs.iot.nokia.com         → Elasticsearch
kafka-prod-b2.enso.saas.nokia.com   → Kafka in production
pfsense.iot.nokia.com               → pfSense firewall
grafana.cbrs.iot.nokia.com          → Grafana dashboards
consul.cbrs.iot.nokia.com           → HashiCorp Consul

Just reading the names: they're running an ELK-adjacent stack with Kafka for streaming, Consul for service discovery, and Grafana for dashboards. That's a detailed architecture picture without touching a single line of code.

Once you have a list, run dig on interesting ones:

dig +short kafka-prod-b2.enso.saas.nokia.com

Then look up the IP in a tool like ipinfo.io or just check the PTR record:

dig +short -x <IP>

If it resolves to *.compute.amazonaws.com → AWS. *.googleusercontent.com → GCP. *.azure.com → Azure. Repeat across a few subdomains and you'll quickly see which cloud(s) they're on. Many large companies are multi-cloud, and the subdomain patterns often tell you which workloads live where.
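
To batch this, a rough loop over your enumeration results works (a sketch; assumes you saved the output to a file like subdomains.txt, one hostname per line):

# Map each discovered subdomain to its IP and reverse-DNS name
while read -r sub; do
  ip=$(dig +short "$sub" | grep -E '^[0-9.]+$' | head -n1)
  [ -n "$ip" ] && printf '%-45s %-16s %s\n' "$sub" "$ip" "$(dig +short -x "$ip")"
done < subdomains.txt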

This one is genuinely underused and feels like a cheat code.

Cisco Umbrella (formerly OpenDNS) operates one of the world's largest DNS resolver networks. Every day, they publish the top 1 million most queried domains and subdomains through their infrastructure — the Cisco Umbrella Popularity List. It's free, public, and updated daily.

Why this is different from subdomain enumeration: enumeration tools find subdomains that exist. The Umbrella list shows subdomains that are actively being used, based on real DNS traffic. This means you'll catch third-party SaaS tools with company-specific subdomains that would never appear in a passive scan.

# Download
curl -O http://s3-us-west-1.amazonaws.com/umbrella-static/top-1m.csv.zip
unzip top-1m.csv.zip

# Search for a company
grep -i "autodesk" top-1m.csv

Or do the same filtering in Python:

import csv

company = "autodesk"
with open("top-1m.csv") as f:
    reader = csv.reader(f)
    matches = [(rank, domain) for rank, domain in reader if company in domain.lower()]

for rank, domain in matches:
    print(f"Rank {rank}: {domain}")

What I found when searching for "autodesk":

autodeskfeedback.az1.qualtrics.com     → Qualtrics for surveys
autodesk.enterprise.slack.com          → Slack Enterprise Grid
autodesk.pagerduty.com                 → PagerDuty for incident management
autodeskglobal.okta.com                → Okta for identity/SSO
autodeskglobal-ssl.mktoweb.com         → Marketo for marketing automation
autodesk.splunkcloud.com               → Splunk for log analysis
*.autodesk.com.edgekey.net             → Akamai CDN
notifications.api.autodesk.com         → Dedicated notifications microservice

From one grep, you can see their incident management stack (PagerDuty), their identity provider (Okta), their SIEM (Splunk), and their CDN (Akamai). Paid technographic tools almost never surface these, because they focus on frontend detection.

Caveat: This only works for companies with enough external traffic to appear in the top million. Smaller startups likely won't show up. But for anything mid-size or larger, it's one of the highest-signal free techniques available.

Part 2: Passive Signals (High signal, zero effort)

These methods require less technical effort but often reveal tools that the recon techniques above will completely miss — especially backend business tooling, internal SaaS, and vendor relationships.

When a SaaS tool needs to verify domain ownership for SSO or SAML integration, they require you to add a TXT record to DNS. These records persist long after the integration is live. They're public, unfakeable, and one of the strongest signals that a company actually uses a product.

dig TXT company.com +short
# Or, for the full answer section with TTLs:
dig TXT company.com

Or use a GUI: dnschecker.org → choose "TXT" record type.

Real example — OpenAI's TXT records reveal:

notion-domain-verification=...        → Notion
atlassian-domain-verification=...     → Jira / Confluence
docker-verification=...               → Docker Hub
postman-domain-verification=...       → Postman
ms-domain-verification=...            → Azure AD / M365
miro-verification=...                 → Miro

Other dev-tool-related patterns follow the same shape: a <vendor>-domain-verification or <vendor>-verification key with an opaque token as the value.

If a verification record exists, someone on the IT or infra team had to explicitly add it. That's a confirmed active integration.

Takes about 60 seconds and often reveals the most direct, unambiguous evidence of what a company actually builds with.

Start at github.com/{company-name}. Even if it's not linked from their website, it's usually guessable. Try the obvious names.

Language breakdown: GitHub shows a bar graph of languages across public repos. 40 repos in Go? That's a Go shop. Python-heavy with some Rust? That's a signal too.
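
If you'd rather script the language survey than eyeball it, the public GitHub API exposes each repo's primary language (a sketch; unauthenticated requests are rate-limited, and company-name is a placeholder org):

# Count primary languages across an org's public repos
curl -s "https://api.github.com/orgs/company-name/repos?per_page=100" \
  | jq -r '.[].language // empty' | sort | uniq -c | sort -rn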

Repo names: Companies often open-source internal tooling, SDKs, and infrastructure modules. Names like company-terraform-modules, company-kafka-consumer, or company-k8s-operators are literal descriptions of their infra.

Dependency files: Open any repo and check:

# Node projects
cat package.json | jq '.dependencies, .devDependencies'

# Python projects  
cat requirements.txt
cat pyproject.toml

# Go projects
cat go.mod

# Ruby
cat Gemfile

# Java/Kotlin
cat build.gradle
cat pom.xml

You don't need to understand the code. Just read the dependency names. A Python repo importing pyspark, delta-spark, and airflow tells you their data engineering stack. A Node repo pulling in @opentelemetry/api tells you they're doing structured observability with OpenTelemetry.

GitHub Actions workflows: This is often overlooked. Check .github/workflows/ in any repo. The workflow YAML files show their CI/CD setup, which testing tools they use, and what cloud they deploy to:

# Look for things like:
# - uses: aws-actions/configure-aws-credentials  → AWS
# - uses: google-github-actions/auth             → GCP
# - uses: hashicorp/setup-terraform              → Terraform
# - uses: docker/build-push-action               → Docker
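
You can also skim workflow files without cloning, via the GitHub contents API (a sketch; org/repo is a placeholder, and it assumes the repo actually has a .github/workflows directory):

# Fetch every workflow file and count which actions they use
curl -s "https://api.github.com/repos/org/repo/contents/.github/workflows" \
  | jq -r '.[].download_url' \
  | xargs -n1 curl -s | grep -E '^[[:space:]]*uses:' | sort | uniq -c | sort -rn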

Search npmjs.com for the company name, or try npmjs.com/~{org-name}. Published packages reveal the frontend frameworks they use and what internal tools they've built and open-sourced. A company publishing a design system built on React and Storybook tells you a lot about their frontend stack.
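
The registry also has a public search endpoint if you want to script it (a sketch; substitute the company name):

# Search the npm registry for packages mentioning the company
curl -s "https://registry.npmjs.org/-/v1/search?text=company&size=25" \
  | jq -r '.objects[].package.name'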

Head to huggingface.co/{company-name}. Useful for any company doing ML work: the org page lists their published models, datasets, and Spaces, which hint at the frameworks and model families they're working with.
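
Hugging Face also exposes a public API for org listings (a sketch; company-name is a placeholder):

# Public models published under an org
curl -s "https://huggingface.co/api/models?author=company-name" | jq -r '.[].modelId'
# Public datasets
curl -s "https://huggingface.co/api/datasets?author=company-name" | jq -r '.[].id'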

Companies that handle personal data — especially for EU customers — are often legally required to disclose every third-party service that touches that data. These are subprocessors, and the lists get published in "Trust Center" or "Security" pages.

For developers, this is the fastest way to find out what SaaS infra a company is paying for: cloud providers, auth platforms, monitoring tools, data platforms.

Google: "[company name] subprocessors"
Google: "[company name] trust center"
Look: footer links labeled "Trust," "Security," or "Privacy"

From a single subprocessor list you can confirm, for example, AWS for cloud, Auth0 for authentication, and Sentry for error monitoring. That's three confirmed infrastructure choices in 30 seconds.

If the list is long, paste it into an LLM and ask it to group by function: "Here is a subprocessor list. Categorize each vendor by function: cloud infrastructure, auth/identity, observability, data storage, CI/CD, etc."

Not every company publishes one. But when they do, it's the most honest signal available — it's literally a receipt of their operational stack.

Status pages (status.company.com, or hosted on Atlassian Statuspage, Instatus, or Better Uptime) are designed for customer communication. But they contain two things that are valuable for tech recon:

The components list reveals architecture. The way a company breaks down their services tells you a lot. Separate components for US-East, EU-West, and APAC confirm multi-region. Separate statuses for API, WebSockets, Background Jobs, and CDN tell you how they've segmented their infrastructure.

Incident history reveals hidden dependencies. When systems fail, companies explain why — and that explanation often names an upstream vendor. This is especially powerful for finding security and infrastructure tools that never show up in DNS or network calls.

Example: After the CrowdStrike outage on July 19, 2024, dozens of companies posted status page incidents explicitly mentioning CrowdStrike. No scanner, DNS lookup, or job posting would ever reveal a company uses CrowdStrike EDR. But their own status page did.

The technique: if a vendor had a major outage on a known date, search status pages for companies that reported issues on the same day mentioning that vendor. You now have a list of confirmed customers.
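
If the page is hosted on Atlassian Statuspage, there's also a public JSON API that makes pulling both the component list and the incident history scriptable (a sketch; not every status page is hosted there):

# Component breakdown
curl -s https://status.company.com/api/v2/summary.json | jq -r '.components[].name'
# Recent incidents (descriptions often name the upstream vendor)
curl -s https://status.company.com/api/v2/incidents.json | jq -r '.incidents[].name'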

When a company hires, they list the exact tools the hire will use. Engineering roles are obvious, but also look at: SRE and DevOps postings (infra stack), data engineering roles (pipeline tools), platform engineering (internal dev platform), and security roles (SIEM, EDR, vulnerability management).

The problem: Most companies only have a few active listings, and old ones disappear from LinkedIn and career pages.

Paste the company's careers page URL (not an individual job listing) into the Wayback Machine and you'll find archived snapshots going back years. Browse old job listings that are no longer live and collect the technologies mentioned.
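
The Wayback Machine's CDX API can also list every archived snapshot of a careers page, which saves clicking through the calendar view (a sketch; adjust the URL pattern to the company's actual careers path):

# List archived snapshots of a careers page (timestamp + URL)
curl -s "https://web.archive.org/cdx/search/cdx?url=company.com/careers*&fl=timestamp,original&collapse=digest&limit=50"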

Once you've collected 8–10 postings across different roles:

Paste all job descriptions into an LLM and ask:
"Extract every specific technology, tool, framework, or platform 
mentioned across these job descriptions, count how many postings 
each appears in, and group them by category."

Frequency is the signal. If Terraform, Kubernetes, and ArgoCD show up across 6 out of 8 engineering postings, that's the real infra stack. If Jenkins shows up once in a posting from 2021, they've probably moved on.
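
If you'd rather skip the LLM step, a crude frequency count over saved postings does the same job (a sketch; assumes you've saved each posting as a text file under a postings/ directory, and the technology list is whatever you're checking for):

# Count how many postings mention each technology
for tech in terraform kubernetes argocd jenkins datadog prometheus dbt airflow snowflake; do
  printf '%-12s %s postings\n' "$tech" "$(grep -ril "$tech" postings/ | wc -l)"
done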

Don't only look at engineering jobs. A Platform Engineering posting mentioning Backstage tells you they're building an internal developer platform. An SRE posting mentioning PagerDuty, Prometheus, and Grafana tells you their observability stack. A Data Engineering posting mentioning dbt, Airflow, and Snowflake tells you their data warehouse setup.

Different techniques reveal different layers of the stack. Here's a practical workflow depending on what you're trying to find:

"What cloud/infra do they run?"
→ Subdomain enumeration → dig A records to cloud IP ranges → curl API headers

"What does their observability stack look like?"
→ Extract third-party domains from network requests → Job postings (SRE/DevOps roles) → DNS TXT records for tools like Datadog, New Relic, Honeycomb

"What's their data/ML infrastructure?"
→ GitHub repos (look for Airflow DAGs, dbt models, Spark configs) → Hugging Face org → Job postings for data engineers

"What does their auth/security stack look like?"
→ DNS TXT records (okta-domain-verification, onelogin-domain-verification) → Subprocessor lists → Status page incidents mentioning security vendors

"What CI/CD and dev tools do they use?"
→ GitHub Actions workflows in public repos → DNS TXT records (GitHub, Postman, Docker) → Job postings for platform/DevOps roles

# API header fingerprint
curl -sI https://api.company.com/

# DNS TXT records
dig TXT company.com +short

# Subdomain enumeration
subfinder -d company.com -silent | tee subdomains.txt

# Trace a subdomain to its hosting provider
dig +short sub.company.com | grep -E '^[0-9.]+$' | xargs -I{} curl -s ipinfo.io/{}

# Cisco Umbrella search
curl -O http://s3-us-west-1.amazonaws.com/umbrella-static/top-1m.csv.zip
unzip top-1m.csv.zip && grep -i "company" top-1m.csv

# Browser: extract third-party domains
[...new Set(performance.getEntriesByType("resource").map(r=>{try{return new URL(r.name).hostname}catch{return null}}).filter(Boolean))].filter(h=>!h.includes(location.hostname)).sort()

Every technique here is free. The recon tools (Amass, subfinder) are open-source. The DNS lookups are public by design. The Cisco Umbrella list is published daily. GitHub repos and NPM packages are intentionally public.

I also built an API around these techniques and released it on GitHub if you want to poke around.

If you're lazy and just want free or paid tools that do all of the above, I compiled a huge list of tech stack lookup tools you can use as an alternative to Wappalyzer or BuiltWith.


