Screaming Frog Gave Me 10,000 Errors. Here's How I Found the 10 That Mattered
I ran Screaming Frog on a client's 4,000-page e-commerce site. It came back with 10,247 issues. I stared at that number and immediately wanted to close my laptop and go outside. 10,000 issues. Where do you even start?

Turns out, after spending two full days triaging everything, only about 10 of those issues were actually impacting rankings or traffic. The rest were technically "issues" according to the tool, but they were either cosmetic, irrelevant, or so low-priority that fixing them would change nothing.

This is the dirty secret of technical SEO tools. They find everything, but they tell you nothing about what matters. Every crawl tool does this: Screaming Frog, Sitebulb, Lumar, Semrush Site Audit. They all generate massive lists of issues and treat them with roughly equal severity. A missing H1 on your 404 page gets flagged the same way as a noindex tag on your highest-traffic landing page. And if you're a developer who doesn't do SEO full time, you have no idea which is which. So you either try to fix everything (impossible), fix nothing (bad), or pick randomly (worse).

Here's what those 10,247 issues actually looked like when I categorized them:

~4,000 issues: Images without alt text

Most were decorative images, icons, and background elements. Yes, technically every image should have an alt attribute. But a decorative divider image missing alt text is not why your traffic is down. The MDN guidelines on alt text actually recommend empty alt attributes for decorative images.

~2,500 issues: Pages with "thin content"

The tool flagged every page under 300 words. This included product category pages, tag pages, pagination pages, and utility pages like contact and login. These pages are supposed to be short. That's not an error.

~1,800 issues: Redirect chains

An old URL redirects to another old URL, which redirects to the current URL. Technically suboptimal, yes. Practically impactful? Almost never. Googlebot follows up to 10 redirect hops. A chain of 2-3 is fine.
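If you want to separate harmless short chains from the genuinely broken ones, the check is a few lines of code. This is a rough sketch, not something from the actual audit: the `chainLength` helper and the sample `redirects` map are made up, and in practice you'd build the map from your crawl tool's redirect export.

```typescript
// Walk a redirect chain from a crawl export and report its length.
// `redirects` maps each URL to the URL it redirects to; a URL that is
// absent from the map is a final, resolving page.
function chainLength(redirects: Map<string, string>, start: string): number {
  const seen = new Set<string>();
  let current = start;
  let hops = 0;
  while (redirects.has(current)) {
    if (seen.has(current)) return Infinity; // redirect loop: always worth fixing
    seen.add(current);
    current = redirects.get(current)!;
    hops++;
  }
  return hops;
}

// Hypothetical export: /old-product -> /older-product -> /product
const redirects = new Map([
  ['/old-product', '/older-product'],
  ['/older-product', '/product'],
]);

chainLength(redirects, '/old-product'); // 2 hops: worth noting, not worth panicking over
```

A chain only becomes a real problem when it loops or approaches the hop limit; a report like this shrinks "1,800 redirect issues" down to the handful that do.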
The performance impact is negligible.

~900 issues: Missing meta descriptions

For blog posts from 2019 that get zero traffic. Not exactly urgent.

~500 issues: Duplicate title tags

Most were pagination pages (Page 2, Page 3, etc.) sharing the same title structure. This is normal and expected.

~300 issues: H1 tag issues

Multiple H1s, missing H1s, H1s that were too long. Some of these matter; most don't. Google has repeatedly said they don't care about H1 tag count.

~150 issues: Mixed content warnings

HTTP images on HTTPS pages. These should be fixed, but they're not urgent SEO issues.

And then there were the 10 issues that actually mattered.

1. The homepage canonical pointed to the wrong URL

https://example.com had a canonical tag pointing to https://example.com/home. This was splitting link equity on the most important page of the site. Fixing this alone probably had more impact than fixing the other 10,237 issues combined.

2-3. Two high-traffic landing pages had accidental noindex tags

Someone added noindex to a page template during a staging deploy and it made it to production. Two pages that drove about 30% of organic traffic were slowly being deindexed. This is the kind of thing that makes you want to add CI checks.

4. The sitemap included 404 pages

About 80 URLs in the sitemap were returning 404. This wastes crawl budget and confuses search engines about your site's structure.

5-6. Two critical internal links were broken

Not just any internal links: links from the homepage to the two highest-converting product pages were returning 404 because someone changed the URL slugs.

7. Structured data was invalid on all product pages

A template change broke the JSON-LD. Google was ignoring structured data on every product page, which meant no rich snippets in search results.

8. Core Web Vitals failed on mobile for the top 20 pages

A new hero image component was serving 3MB images without responsive sizing. This pushed LCP over 4 seconds on mobile.

9.
Hreflang tags were misconfigured

The site had international versions, and the hreflang implementation was pointing to the wrong language URLs. This was causing the wrong language version to rank in each country.

10. robots.txt was blocking the /api/ path, which also blocked /api-documentation/

An overly broad robots.txt rule intended to block API endpoints was also blocking the entire API documentation section, which was a major traffic driver.

The framework I use now is simple. For every issue a crawl tool finds, ask three questions:

Does this page get traffic? If a page gets zero visits, fixing SEO issues on it is pointless.

Does this issue prevent indexing or ranking? Noindex, broken canonicals, and crawl blocks prevent indexing. Missing alt text does not.

Does this affect user experience? Broken links and slow pages affect both UX and SEO. Missing meta descriptions only affect CTR, and only sometimes.

If the answer to all three is no, skip it. Move on.

```typescript
// Simple prioritization framework
interface CrawlIssue {
  url: string;
  type: string;
  severity: string;
}

interface PrioritizedIssue extends CrawlIssue {
  priority: 'critical' | 'high' | 'medium' | 'low' | 'ignore';
  monthlyTraffic: number;
}

function prioritize(issue: CrawlIssue, traffic: number): PrioritizedIssue {
  const indexBlockers = ['noindex', 'canonical_error', 'robots_blocked', 'sitemap_404'];
  const uxImpact = ['broken_link', 'slow_page', 'mobile_fail', 'structured_data_error'];

  let priority: PrioritizedIssue['priority'] = 'ignore';

  if (indexBlockers.includes(issue.type) && traffic > 100) {
    priority = 'critical';
  } else if (indexBlockers.includes(issue.type)) {
    priority = 'high';
  } else if (uxImpact.includes(issue.type) && traffic > 100) {
    priority = 'high';
  } else if (uxImpact.includes(issue.type)) {
    priority = 'medium';
  } else if (traffic > 500) {
    priority = 'medium';
  } else {
    priority = 'low';
  }

  return { ...issue, priority, monthlyTraffic: traffic };
}
```

Honestly, the real issue is that most crawl tools are built to
find as many issues as possible. More issues = tool looks more valuable = you keep paying. Nobody sells a tool that says "we only found 10 issues." That doesn't feel like $259/year worth of value.

I got tired of this and built SiteCrawlIQ to prioritize issues by traffic impact instead of dumping everything in a flat list. An issue on a page with 10,000 monthly visits is not the same as an issue on a page with 0 visits, and your crawl tool should know the difference.

Next time a crawl tool gives you 10,000 issues, don't panic. Export the list, cross-reference with your analytics, and find the 10 that actually move the needle. Then close the other 9,990 and go do something useful.
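That cross-referencing step is just a join between two exports: the crawl tool's issue list and a URL-to-visits map from your analytics. Here's a minimal sketch of the idea; the `findWorthFixing` helper and all the sample data are hypothetical, and the traffic threshold is something you'd tune for your own site.

```typescript
interface ExportedIssue {
  url: string;
  type: string;
}

// Keep only issues on pages that actually get traffic,
// with the highest-traffic pages first.
// `traffic` maps URL -> monthly organic visits from an analytics export.
function findWorthFixing(
  issues: ExportedIssue[],
  traffic: Map<string, number>,
  minVisits = 100,
): ExportedIssue[] {
  return issues
    .filter((i) => (traffic.get(i.url) ?? 0) >= minVisits)
    .sort((a, b) => (traffic.get(b.url) ?? 0) - (traffic.get(a.url) ?? 0));
}

// Hypothetical crawl export and analytics data
const issues: ExportedIssue[] = [
  { url: '/blog/old-post', type: 'missing_meta_description' },
  { url: '/', type: 'canonical_error' },
  { url: '/products/best-seller', type: 'broken_link' },
];

const traffic = new Map([
  ['/', 12000],
  ['/products/best-seller', 4500],
  ['/blog/old-post', 0],
]);

findWorthFixing(issues, traffic); // the zero-traffic blog post drops out
```

Even a crude filter like this collapses a five-digit issue count into a list short enough to read, and the issues that survive are sorted by how much traffic is actually at stake.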
