Try crawling dmoz.org, being sure to restrict Xenu’s access to “editors.dmoz
Posted: Sun Dec 22, 2024 6:41 am
I mean, once you get your data into Excel, the world is your oyster. Mmmmmm, data oysters.

But wait, that's not all! I reached out to Rich Baxter, a very knowledgeable and smart SEO who uses Xenu a lot, and asked him for a killer tip. Here it is. Thanks a lot, Rich, for getting me this at short notice.

Crawling web directories, looking for errors (By Rich Baxter)

Xenu’s not just a great tool to look inside your own site, it’s also pretty powerful for crawling external resources like directories, particularly if you’re looking for a domain to buy.
Restrict Xenu’s access as you crawl, but allow the crawler to “check external links”. Quite quickly you’ll start finding “not found” URL errors from directory entries that might have been forgotten, on domains that may not yet have expired. Just sort by “status” in the crawl results table in Xenu.

Here’s one I found earlier. I’m pretty sure that with the right offer via SEDO, the owner of fridgemagnet.org.uk (with its 634 subdomain links) might be interested in selling before the domain expires.

Right-click any result you’re interested in: I’ve always found the “Copy URL”, Google cache and Wayback Machine links in that menu invaluable.

As a side note: if you are crawling external resources, try to be a good citizen and crawl slowly. Set your maximum threads to a very low level, so as not to get your IP banned by your target host.
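
If you'd rather not eyeball the sorted table, the "sort by status, look for not founds" step can be roughed out in a few lines of Python. This is only a sketch under assumptions: it expects you to have already pulled (URL, status text) pairs out of your crawl results yourself, and the function name `not_found_hosts` and the sample URLs are made up for illustration, not anything Xenu provides.

```python
from collections import Counter
from urllib.parse import urlsplit


def not_found_hosts(rows):
    """Count "not found" results per host, most frequent first.

    rows: iterable of (url, status_text) pairs taken from a crawl report.
    The pair layout is an assumption, not Xenu's own export format.
    """
    counts = Counter()
    for url, status in rows:
        # Compare case-insensitively so "Not Found" and "not found" both match.
        if status.strip().lower() == "not found":
            host = urlsplit(url).hostname
            if host:
                counts[host] += 1
    return counts.most_common()


# Hypothetical sample data standing in for real crawl results.
sample = [
    ("http://www.example.org/", "ok"),
    ("http://shop.example.net/page", "not found"),
    ("http://shop.example.net/other", "not found"),
    ("http://old.example.com/", "Not Found"),
]

for host, hits in not_found_hosts(sample):
    print(host, hits)
```

Hosts that show up again and again with "not found" are the ones worth checking for expiry dates first.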