Google Dorking on .gov.uk
Google dorking (the use of advanced search operators) is a powerful method for mapping attack surfaces, discovering forgotten subdomains, and finding leaked documents that contain sensitive metadata across the UK government's digital estate.
Mapping the Perimeter
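Most users only ever interact with www.gov.uk. However, hundreds of undocumented local, regional, and staging subdomains exist. Combining the site: operator with the exclusion operator (a leading minus sign), as in site:gov.uk -site:www.gov.uk, filters out the main portal so that less visible infrastructure rises to the top of the results.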
This helps identify legacy platforms, internal dashboards that have accidentally been exposed to search engine spiders, and localized services.
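The exclusion approach above can be sketched as a small query builder. This is a minimal illustration, not a definitive tool: the function name build_dork and its parameters are hypothetical, and it only composes a query string from operators Google documents (site:, a leading minus for exclusion, filetype:, intitle:, inurl:). It does not send any requests.

```python
def build_dork(site, exclude_sites=(), filetype=None, intitle=None, inurl=None):
    """Compose a search query string from common Google operators.

    All parameter names are illustrative assumptions, not part of any API.
    """
    parts = [f"site:{site}"]
    # A leading minus excludes results from the given hosts.
    parts += [f"-site:{host}" for host in exclude_sites]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if intitle:
        parts.append(f'intitle:"{intitle}"')
    if inurl:
        parts.append(f"inurl:{inurl}")
    return " ".join(parts)


# Everything under gov.uk except the main portal.
perimeter_query = build_dork("gov.uk", exclude_sites=["www.gov.uk"])
print(perimeter_query)  # site:gov.uk -site:www.gov.uk
```

Further hosts can be appended to exclude_sites as they are identified, which progressively narrows the results toward subdomains that have not yet been catalogued.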
Extracting Hidden Metadata
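Government organizations publish thousands of office documents annually. Older formats (like .doc, .xls, .ppt) often carry embedded document properties and other metadata (EXIF is the equivalent for image files, not office documents), including: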
- Author name & organization structure
- Local file and template paths (which can reveal internal Active Directory usernames)
- Software suites & OS build versions
- Internal network printer paths
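To see what kind of fields are involved, the sketch below reads the author, last editor, application, and company properties from a document before it is published. It is a sketch under stated assumptions: it targets the newer OOXML formats (.docx, .xlsx, .pptx), which are ZIP archives readable with the Python standard library, rather than the legacy binary formats named above, which need a dedicated OLE parser. The function name read_ooxml_metadata is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the OOXML core and extended property parts.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}


def read_ooxml_metadata(path_or_file):
    """Return identifying properties found in an OOXML document.

    Accepts a filesystem path or a binary file-like object.
    Only properties that are present and non-empty are returned.
    """
    found = {}
    with zipfile.ZipFile(path_or_file) as archive:
        names = set(archive.namelist())
        if "docProps/core.xml" in names:
            core = ET.fromstring(archive.read("docProps/core.xml"))
            found["author"] = core.findtext("dc:creator", namespaces=NS)
            found["last_modified_by"] = core.findtext("cp:lastModifiedBy", namespaces=NS)
        if "docProps/app.xml" in names:
            app = ET.fromstring(archive.read("docProps/app.xml"))
            found["application"] = app.findtext("ep:Application", namespaces=NS)
            found["app_version"] = app.findtext("ep:AppVersion", namespaces=NS)
            found["company"] = app.findtext("ep:Company", namespaces=NS)
    return {key: value for key, value in found.items() if value}
```

Run against a file in a publishing queue, for example read_ooxml_metadata("report.docx") with a hypothetical filename, any returned key is a field that would be exposed to whoever downloads the document.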
Brute-forcing Sequential Assets (Cafcass Case Study)
A common design pattern on CMS setups is hosting media attachments inside directories structured with auto-incrementing numerical IDs. A classic example is Cafcass (Children and Family Court Advisory and Support Service).
When file paths follow a structure like:
https://www.cafcass.gov.uk/media/1042/
an analyst or attacker can automate requests across an ID range, for example 1 to 10000. This can uncover PDFs, drafts, or media attachments that were uploaded to the backend but never linked from public navigation pages.
Google dorks can also query these media directories directly, for example by combining site: with inurl:media. If the search engine has already crawled the files, they appear in the results even though no public page links to them.
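One way to judge whether a set of URLs follows the auto-incrementing pattern described above is to extract the numeric IDs and measure how densely they fill their range. The sketch below works offline on a list of URLs that has already been collected (for instance from a sitemap or search results). The /media/<id>/ path shape is taken from the example above; the function names and the density measure are illustrative assumptions.

```python
import re
from urllib.parse import urlparse

# Matches paths such as /media/1042/ or /media/1042/some-file.pdf
MEDIA_ID = re.compile(r"^/media/(\d+)(?:/|$)")


def media_ids(urls):
    """Extract the sorted, de-duplicated numeric media IDs from a list of URLs."""
    ids = set()
    for url in urls:
        match = MEDIA_ID.match(urlparse(url).path)
        if match:
            ids.add(int(match.group(1)))
    return sorted(ids)


def id_density(ids):
    """Fraction of the observed ID range that is actually present (0.0 to 1.0)."""
    if not ids:
        return 0.0
    span = ids[-1] - ids[0] + 1
    return len(ids) / span
```

A density close to 1.0 over a long run of small integers suggests predictable, guessable identifiers, while sparse or very large values point to random or hashed names that are much harder to enumerate.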
Directory Listing Indexes
When a web server leaves directory indexing enabled, requesting a folder path that has no index page returns an auto-generated HTML listing of its files. Dorks such as intitle:"index of", combined with site: to restrict the scope, surface these listings and help threat researchers map directory structures.
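The check that the intitle:"index of" dork relies on can be sketched as a parser that inspects a page's title and collects its links. This is a heuristic sketch, assuming the common convention that auto-generated listings are titled "Index of /path"; the class and function names are hypothetical, and the function operates on an HTML string that has already been retrieved.

```python
from html.parser import HTMLParser


class IndexPageParser(HTMLParser):
    """Collects the <title> text and every href found in <a> tags."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse_directory_index(html):
    """Return (looks_like_index, entries) for an HTML page.

    Heuristic: the title starts with "index of". Column-sort links
    (starting with "?") and the "../" parent link are skipped.
    """
    parser = IndexPageParser()
    parser.feed(html)
    looks_like_index = parser.title.strip().lower().startswith("index of")
    entries = [h for h in parser.links if not h.startswith("?") and h != "../"]
    return looks_like_index, entries
```

Site owners can apply the same check to their own folder URLs to confirm that indexing is disabled before a search engine crawls them.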