From Screaming Frog:
Screaming Frog SEO Spider is a small desktop program you can install on your Mac that spiders websites' links, images, CSS & scripts from an SEO perspective. You can view, analyse and filter the information as it's gathered and updated continuously in the program's user interface.
The Screaming Frog SEO Spider allows you to quickly analyse or review a site from an onsite SEO perspective. It's particularly good for analysing medium to large sites, where manually checking every page would be extremely labour intensive and where you can easily miss a redirect, meta refresh or duplicate page issue. The spider allows you to export key onsite SEO elements (URL, page title, meta descriptions, headings etc.) to Excel so they can easily be used as a base for SEO recommendations.
A quick summary of some of the data it collects:
- Errors: Client & server errors (4XX, 5XX)
- Redirects: 3XX, permanent or temporary
- External Links: All followed links and their subsequent status codes
- URI Issues: Non-ASCII characters, underscores, uppercase characters, dynamic URIs, over 115 characters
- Duplicate Pages: Hash value / MD5 checksums lookup for duplicate pages
- Page Title: Missing, duplicate, over 70 characters, same as h1, multiple
- Meta Description: Missing, duplicate, over 156 characters, multiple
- Meta Keyword: Mainly for reference, as it's only (barely) used by Yahoo. Missing, duplicate, multiple
- H1: Missing, duplicate, over 70 characters, multiple
- H2: Missing, duplicate, over 70 characters, multiple
- Meta Robots: Index, noindex, follow, nofollow, noarchive, nosnippet, noodp, noydir etc.
- Meta Refresh: Including target page and time delay
- Canonical link element
- File Size
- Page depth level
- Inlinks: All pages linking to a URI
- Outlinks: All pages a URI links out to
- Anchor Text: All link text. Alt text from images with links
- Follow & Nofollow: At link level (true/false)
- Images: All URIs with the image link & all images from a given page. Images over 100 KB, missing alt text, alt text over 100 characters
- User-Agent Switcher: Crawl as Googlebot, Bingbot, or Yahoo! Slurp
- Custom Source Code Search: The spider allows you to find anything you want in the source code of a website, whether that's analytics code, specific text, or code etc.
- XML Sitemap Generator: You can create a basic XML sitemap using the SEO spider.
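The duplicate-page check in the summary above is described as a hash value / MD5 checksum lookup. As a rough illustration of that general idea only (this is not Screaming Frog's actual implementation), here is a minimal Python sketch. The function name `find_duplicate_pages` and the sample URLs and page bodies are hypothetical, and it assumes the page bodies have already been fetched as bytes.

```python
import hashlib
from collections import defaultdict


def find_duplicate_pages(pages):
    """Group URLs whose bodies share the same MD5 digest.

    `pages` maps a URL (str) to its response body (bytes).
    Returns a list of URL groups; each group has 2+ identical pages.
    """
    urls_by_digest = defaultdict(list)
    for url, body in pages.items():
        # Identical bodies always produce the same digest.
        digest = hashlib.md5(body).hexdigest()
        urls_by_digest[digest].append(url)
    # Keep only digests seen on more than one URL.
    return [sorted(urls) for urls in urls_by_digest.values() if len(urls) > 1]


# Hypothetical sample data: two URLs serving byte-identical content.
sample_pages = {
    "http://example.com/": b"<html><body>Home</body></html>",
    "http://example.com/index.html": b"<html><body>Home</body></html>",
    "http://example.com/about": b"<html><body>About</body></html>",
}

duplicates = find_duplicate_pages(sample_pages)
print(duplicates)
```

A checksum comparison like this only catches byte-for-byte identical pages; pages that differ by even a single character hash differently and would not be grouped.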
What's new in this version:
- Google Analytics Integration: You can now connect to the Google Analytics API and pull in data directly during a crawl.
- Custom Extraction: The new 'custom extraction' feature allows you to collect any data from the HTML of a URL.
- Other smaller updates & bug fixes