Ask HN: How did the internet discover my subdomain? by govideo

33 Comments

  • Post Author
    codingdave
    Posted March 6, 2025 at 10:39 pm

    If it is on DNS, it is discoverable. Even if it were not, the message you pasted says outright that they scan the entire IP space, so they could be hitting your server's IP without having a clue there is a subdomain serving your stuff from it.

  • Post Author
    Kikawala
    Posted March 6, 2025 at 10:39 pm

    Is it available under HTTPS? Then it's probably in a Certificate Transparency log.

  • Post Author
    daggersandscars
    Posted March 6, 2025 at 10:48 pm

    The DNS AXFR query type (a zone transfer) returns every record in a zone, subdomains included. Most DNS servers restrict who is allowed to request one. Given the number of places online where one can run a subdomain query, I suspect it's mostly a matter of paying the right fees to the right DNS provider.

  • Post Author
    artursapek
    Posted March 7, 2025 at 12:00 am

    presumably it has a DNS record

  • Post Author
    vince14
    Posted March 7, 2025 at 12:06 am

    I'm having the same issue.

    https://securitytrails.com/ also had my "secret" staging subdomain.

    I made a catch-all certificate, so the subdomain didn't show up in CT logs.

    It's still a secret to me how my subdomain ended up in their database.

  • Post Author
    parliament32
    Posted March 7, 2025 at 12:15 am

    Certificate Transparency logs, or they don't actually know the domain name: just port-scanning[1] then making requests to open web ports.

    [1] Turns out you can port-scan the entire internet in under 5 minutes: https://github.com/robertdavidgraham/masscan

  • Post Author
    fsckboy
    Posted March 7, 2025 at 12:26 am

    LPT, this is an object lesson in the weakness of security through obscurity

  • Post Author
    OuterVale
    Posted March 7, 2025 at 12:27 am
  • Post Author
    8bitchemistry
    Posted March 7, 2025 at 12:29 am

    Did you ever email the URL to somebody?
    We had the same issue years ago where google seemed to be crawling/indexing new subdomains it finds in emails.

  • Post Author
    andix
    Posted March 7, 2025 at 12:35 am

    I'm surprised nobody mentioned subfinder yet: https://github.com/projectdiscovery/subfinder

    Subfinder uses different public and private sources to discover subdomains. Certificate Transparency logs are a great source, but it also has some other options.

  • Post Author
    spl757
    Posted March 7, 2025 at 12:41 am

    Does the IP address for that subdomain have a DNS PTR record set? If it does, someone can discover the subdomain by querying the PTR record for the IP.
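
A minimal sketch of the PTR check described above, using only the Python standard library. The function names `ptr_query_name` and `reverse_lookup` are illustrative, and 192.0.2.10 is a documentation address rather than a real host.

```python
import ipaddress
import socket


def ptr_query_name(ip):
    """Return the DNS name that a PTR query for `ip` is sent to."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip):
    """Return the hostname from the PTR record for `ip`, or None if there is none.

    Needs a working resolver, so the result depends on the network.
    """
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    except (socket.herror, socket.gaierror):
        return None
    return hostname


if __name__ == "__main__":
    print(ptr_query_name("192.0.2.10"))
```

If `reverse_lookup` returns the "secret" subdomain for the server's IP, anyone who scans that IP can learn the name the same way.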

  • Post Author
    andix
    Posted March 7, 2025 at 12:44 am

    If an HTTPS service should be hard to discover, an easy way is to hide it behind a subdirectory. Something like https://subdomain.domain.example/hard_to_find_secret_string.

    Another option is a wildcard certificate.

    This obviously can't be the only protection. But if an attacker doesn't know about a service, or misses it during discovery, they can't attack it.

  • Post Author
    LinuxBender
    Posted March 7, 2025 at 1:33 am

    As others have said, likely cert transparency logs. Use a wildcard cert to avoid this. They are free using LetsEncrypt and possibly a couple other ACME providers. I have loads of wildcard certs. Bots will try guessing names but like you I do not use easily guessable names and the bots never find them. I log all DNS answers. I assume cloudflare supports strict-SNI but no idea if they have their own automation around wildcard certs. Sometimes I renew wildcard certs I am not even using just to give the bots something to do.
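
To see why a wildcard certificate keeps a name out of the CT logs, here is a rough sketch of how a wildcard entry is matched against a hostname. It assumes the common rule that `*` stands for exactly one leftmost label; `wildcard_covers` is an illustrative helper, not a library function, and real TLS libraries apply additional checks.

```python
def wildcard_covers(pattern, hostname):
    """Return True if a certificate name such as "*.example.com" covers `hostname`."""
    pattern, hostname = pattern.lower(), hostname.lower()
    if not pattern.startswith("*."):
        # Not a wildcard entry: the names must match exactly.
        return pattern == hostname
    # The "*" replaces exactly one leftmost label, so compare everything after it.
    first_label, _, rest = hostname.partition(".")
    return bool(first_label) and rest == pattern[2:]


if __name__ == "__main__":
    print(wildcard_covers("*.example.com", "staging-x7.example.com"))
```

The log entry only ever contains `*.example.com`, so the label `staging-x7` is never published. It can still leak through DNS or other channels mentioned in this thread.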

  • Post Author
    pabs3
    Posted March 7, 2025 at 2:35 am

    ArchiveTeam has some docs about this:

    https://wiki.archiveteam.org/index.php/Finding_subdomains

  • Post Author
    ciaovietnam
    Posted March 7, 2025 at 3:08 am

    There is a chance that your subdomain is the first/default virtual host in your web server setup (or the subdomain's access log is the default log file) so any requests to the server's IP address get logged to this virtual host. That means they didn't access your subdomain, they accessed via your server IP address but got logged in your subdomain's access log.
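
The default-virtual-host effect can be modelled in a few lines. This is a simplified sketch, not how any particular web server is implemented: `pick_vhost` is a made-up helper, the first list entry is assumed to be the default, and the port stripping does not handle IPv6 literals.

```python
def pick_vhost(host_header, vhosts):
    """Return the virtual host that would serve (and log) a request.

    `vhosts` is an ordered list of server names; the first one is the default.
    """
    # Drop an optional ":port" suffix and normalise case.
    name = (host_header or "").split(":")[0].lower()
    for vhost in vhosts:
        if vhost == name:
            return vhost
    # No match (bare IP, missing header, unknown name): fall back to the default.
    return vhosts[0]


if __name__ == "__main__":
    hosts = ["secret.example.com", "www.example.com"]
    print(pick_vhost("203.0.113.7", hosts))
```

A scanner that sends the bare IP as its Host header ends up in the log of `secret.example.com` without ever having known that name.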

  • Post Author
    alberth
    Posted March 7, 2025 at 3:29 am

    This site will find any subdomain, for any domain, so long as a certificate (SSL/TLS) naming it has ever been issued.

    https://crt.sh/

  • Post Author
    thedougd
    Posted March 7, 2025 at 5:23 am

    Some CAs (Amazon) allow not publishing to the Certificate Transparency Log. But if you do this, browsers will block the connection by default. Chromium browsers have a policy option to skip this check for selected URLs. See: CertificateTransparencyEnforcementDisabledForURLs.

    Some may find this more desirable than wildcard certificates and their drawbacks.

  • Post Author
    rempargo
    Posted March 7, 2025 at 5:31 am

    I assume you host this with an HTTPS certificate, so you can look up your subdomains at:

    https://crt.sh/?q=sampledomain.com
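
The same lookup can be scripted. The sketch below assumes crt.sh's `output=json` parameter, its `%` wildcard, and the `name_value` field in each result row, which is how the service has behaved but is not guaranteed; the function names are illustrative.

```python
import json
import urllib.parse
import urllib.request


def crtsh_url(domain):
    """Build a crt.sh JSON query for every logged name under `domain`."""
    # "%.domain" is the wildcard form; urlencode escapes the "%" as "%25".
    query = urllib.parse.urlencode({"q": "%." + domain, "output": "json"})
    return "https://crt.sh/?" + query


def names_from_entries(entries):
    """Collect the distinct DNS names found in crt.sh result rows."""
    names = set()
    for entry in entries:
        # One certificate can cover several names, separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            names.add(name.strip().lower())
    names.discard("")
    return names


def logged_names(domain):
    """Fetch and parse the CT entries for `domain` (needs network access)."""
    with urllib.request.urlopen(crtsh_url(domain), timeout=30) as resp:
        return names_from_entries(json.load(resp))


if __name__ == "__main__":
    print(crtsh_url("sampledomain.com"))
```

Subdomains served only through a wildcard certificate show up as `*.sampledomain.com` rather than by their own name.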

  • Post Author
    melson
    Posted March 7, 2025 at 5:41 am

    Someone might have used an open-source tool like Sublist3r.

  • Post Author
    paxys
    Posted March 7, 2025 at 5:55 am

    Not sure why everyone is going on about certificate transparency logs when the answer is right there in the user agent. The company is scanning the ipv4 space and came upon your IP and port.

  • Post Author
    DeborahMatthews
    Posted March 7, 2025 at 7:00 am

    [dead]

  • Post Author
    arkfil
    Posted March 7, 2025 at 7:21 am

    Palo Alto Networks devices (firewalls and the like) are able to scan the sites that users behind them visit. These devices are very popular in many companies. Users can also have agents installed on their computers that have access to the sites they visit.

  • Post Author
    govideo
    Posted March 7, 2025 at 7:51 am

    Thanks for everyone's perspectives. Very educational and admittedly lots outside the boundaries of my current knowledge. I have thus far relied on CloudFlare's automatic https and simple instant subdomain setup for their worker microservice I'm using.

    There are evidently technical/footprint implications of that convenience. Fortunately, I'm not really concerned with the subdomain being publicly known; I was more curious how it became publicly known.

  • Post Author
    bashwizard
    Posted March 7, 2025 at 8:00 am

    Like people have said already: Certificate Transparency logs.

    There are countless tools for subdomain enumeration. I personally use subfinder or amass when doing recon on bug bounty targets.

  • Post Author
    3oil3
    Posted March 7, 2025 at 9:03 am

    What happens if you google your subdomain?
    Maybe the bots have some sort of dictionary files and they just run them, and when there is a match, then they append it with some .html extension, or maybe they prepend it to the match as a subdomain of it?

  • Post Author
    f4c39012
    Posted March 7, 2025 at 9:26 am

    CSP headers can leak urls, but I assume that isn't the cause here if the subdomain is an entirely separate project

  • Post Author
    ThePowerOfFuet
    Posted March 7, 2025 at 10:23 am

    Others are saying CT logs but my own subdomains are on wildcard certificates, in which case I suspect they are discovered by DPI analysis of DNS traffic and resold, such as by Team Cymru.

  • Post Author
    BLKNSLVR
    Posted March 7, 2025 at 10:26 am

    There are a number of companies, not just Palo Alto Networks, that scan the entire IPv4 space at various scales, some of them multiple times per day.

    I set up a set of scripts to log all "uninvited activity" to a couple of my systems, from which I discovered a whole bunch of these scanner "security" companies. Personally, I treat them all as malicious.

    There are also services that track Newly Registered Domains (NRDs).

    Tangentially:

    NRD lists are useful for DNS block lists since a large number of NRDs are used for short term scam sites.

    My little, very amateur, project to block them can be found here:
    https://github.com/UninvitedActivity/UninvitedActivity

    Edited to add:
    Direct link to the list of scanner IP addresses (although hasn't been updated in 8 months – crikey, I've been busy longer than I thought):
    https://github.com/UninvitedActivity/UninvitedActivity/blob/…

  • Post Author
    lockhead
    Posted March 7, 2025 at 10:28 am

    Most likely passive DNS data. If you use your subdomain, DNS queries are made for it. If the DNS server that resolves those queries shares this data, it can be picked up by others.

  • Post Author
    nusl
    Posted March 7, 2025 at 10:40 am

    It's pretty common to brute-force subdomains of a domain you might be interested in, especially for attackers.
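
A bare-bones sketch of what such brute-forcing looks like: prepend each word from a list to the domain and keep the names that resolve. The word list, function names, and the injectable `resolver` parameter are all illustrative; real tools use far larger lists and also detect wildcard DNS, which would make every candidate appear to exist.

```python
import socket


def candidates(domain, words):
    """Turn a word list into candidate hostnames under `domain`."""
    return [word + "." + domain for word in words]


def resolves(hostname):
    """Return True if the system resolver finds an address for `hostname`."""
    try:
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False
    return True


def brute_force(domain, words, resolver=resolves):
    """Return the candidate names for which `resolver` reports a hit."""
    return [name for name in candidates(domain, words) if resolver(name)]


if __name__ == "__main__":
    print(candidates("example.com", ["dev", "staging", "test"]))
```

Names like `staging` or `dev` fall to this quickly, which is why the commenters who rely on obscurity stress using labels that are not guessable.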

  • Post Author
    xg15
    Posted March 7, 2025 at 10:56 am

    TIL (from this thread) : You can abuse TLS handshakes to effectively reverse-DNS an IP address without ever talking to a DNS server! Is this built into dig yet? :)

    (Alright, some IP addresses, not all of them)

    I also wonder if this is a potential footgun for eSNI deployments: If you add eSNI support to a server, you must remember to also make regular SNI mandatory – otherwise, an eavesdropper can just ask your server nicely for the domain that the eSNI encryption was trying to hide from it.
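
The TLS trick can be sketched with Python's `ssl` module: connect to the bare IP without SNI and read the DNS names from whatever default certificate comes back. This assumes the server presents a publicly trusted certificate, since otherwise the handshake raises an error before any names can be read; the function names are illustrative.

```python
import socket
import ssl


def dns_names(cert):
    """Pull the DNS entries out of a certificate dict as returned by getpeercert()."""
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


def names_served_at(ip, port=443, timeout=5.0):
    """Connect to `ip` without SNI and list the names on the default certificate.

    Needs network access to the target.
    """
    ctx = ssl.create_default_context()
    # Keep chain validation but skip hostname matching, since we connect by IP.
    ctx.check_hostname = False
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        # No server_hostname means no SNI, so the server picks its default cert.
        with ctx.wrap_socket(sock) as tls:
            return dns_names(tls.getpeercert())


if __name__ == "__main__":
    sample = {"subjectAltName": (("DNS", "secret.example.com"),)}
    print(dns_names(sample))
```

A server that refuses handshakes without a matching SNI value gives this probe nothing to read.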

  • Post Author
    _trampeltier
    Posted March 7, 2025 at 11:01 am

    Did you send a link over email, WhatsApp, or something like that?

  • Post Author
    ralferoo
    Posted March 7, 2025 at 11:32 am

    If you're using HTTPS, then you're probably using letsencrypt and so your subdomain will appear on the CT logs that are publicly accessible.

    One thing you could do is use a wildcard certificate, and then use a non-obvious subdomain under it. I actually have something similar. In my setup, all my web traffic goes to haproxy frontends which forward traffic to the appropriate backend, and I was sick of setting up a new certificate for each new subdomain, so I just replaced them all with a single wildcard cert instead. This means that I'm not advertising each new subdomain in the CT logs. The subdomains all look nominally the same when visited (same holding page on index and same /api handling), but one of them decodes an additional URL path that provides access to status monitoring.

    Separately, that Palo Alto Networks company is a real pain. They connect to absolutely everything in their attempts to spam the internet. Frankly, I'm sick of even my mail servers being bombarded with HTTP requests on port 25 and the resultant log spam.

© 2025 HackTech.info. All Rights Reserved.