The rules for cold email deliverability changed fundamentally in 2024 and 2025, and enforcement has only tightened since. Google, Yahoo, and now Microsoft all require proper authentication (SPF, DKIM, and DMARC) from bulk senders, enforce a hard spam complaint ceiling, and actively reject non-compliant messages at the SMTP level as of November 2025. What was once seen as best practice is now the infrastructure that determines whether your emails arrive at all.
This guide breaks down every major deliverability lever for B2B cold outreach and explains how to improve each one.
SPF, DKIM, and DMARC are now table stakes
These 3 protocols form an authentication chain. Each does something different and you need all 3 configured correctly:
SPF (Sender Policy Framework)
SPF is a DNS TXT record that lists every IP address and server authorised to send email on behalf of your domain. When a receiving server gets your email, it checks the Return-Path domain’s SPF record against the connecting IP. Critically, SPF checks the hidden envelope sender, not the visible “From:” address the recipient sees.
A typical SPF record for a cold outreach domain looks like: v=spf1 include:_spf.google.com include:spf.coldemailtool.com ~all. The ~all (soft fail) tells receivers that unlisted IPs are probably unauthorised, but emails should still be accepted. Use ~all on sending domains and -all (hard fail) only on parked domains. Hard fail on a sending domain can cause receivers to reject at the SMTP level before DKIM is evaluated, bypassing DMARC’s designed fallback logic.
The critical SPF constraint is the 10 DNS lookup limit (RFC 7208). Each include, a, mx, ptr, or exists mechanism and each redirect modifier counts toward this cap, including those nested inside included records. If you are running Google Workspace plus a CRM, a cold email platform, and a warm-up tool, you can easily breach this limit. Exceeding it returns a PermError, which DMARC interprets as a fail. Roughly 20% of SPF records are broken this way. The fix is subdomain segmentation. Move different sending services to subdomains (e.g., outreach.company.com), each with its own SPF record and independent lookup budget. DMARC still passes under relaxed alignment since the organisational domain matches. The other common workaround is SPF flattening, which replaces include mechanisms with hard-coded IP addresses. Notably, dmarcian, the company that originally invented SPF flattening, now actively discourages it, because provider IP changes cause silent validation failures.
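As a rough illustration of the lookup budget, the sketch below counts the lookup-consuming terms in a single SPF record string. It is a simplification: it only inspects the record you pass in and does not follow nested includes, which also count toward the limit in a real evaluation. The function name is hypothetical.

```python
import re

# Terms that trigger a DNS lookup under RFC 7208 (section 4.6.4):
# five mechanisms plus the "redirect" modifier.
LOOKUP_TERMS = ("include", "a", "mx", "ptr", "exists", "redirect")


def count_spf_lookups(record: str) -> int:
    """Count lookup-consuming terms in one SPF record (top level only)."""
    count = 0
    for term in record.split()[1:]:  # skip the "v=spf1" version tag
        # Drop any qualifier (+ - ~ ?) and keep the name before ":", "=" or "/".
        name = re.split(r"[:=/]", term.lstrip("+-~?"), maxsplit=1)[0].lower()
        if name in LOOKUP_TERMS:
            count += 1
    return count


record = "v=spf1 include:_spf.google.com include:spf.coldemailtool.com ~all"
print(count_spf_lookups(record))  # 2
```

A record that already sits at 7 or 8 top-level lookups is a likely candidate for subdomain segmentation, since each include usually pulls in further lookups of its own.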
DKIM (DomainKeys Identified Mail)
DKIM uses RSA public-key cryptography to prove message integrity. Your sending server generates a hash of specified email headers and body content, signs it with a private key, and attaches the signature in the DKIM-Signature header. The receiving server queries your DNS for the public key at selector._domainkey.yourdomain.com and verifies the signature. If anything was altered in transit, verification fails. The consensus recommendation for 2025-2026 is 2048-bit RSA keys – 1024-bit still functions but is increasingly considered inadequate, and NIST recommends 2048-bit as the minimum. Rotate keys every 6-12 months using selector-based transitions.
DKIM’s critical advantage over SPF is that it survives email forwarding. SPF breaks when a message is forwarded because the forwarding server’s IP isn’t in the original sender’s SPF record. DKIM’s cryptographic signature is embedded in the message headers and travels with the email regardless of which server delivers it. This is why DKIM effectively carries DMARC authentication for forwarded messages, and why configuring custom DKIM signing with your own domain (not your ESP’s domain) is essential for alignment.
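To make the selector lookup concrete, here is a minimal sketch that reads the s= (selector) and d= (signing domain) tags from a DKIM-Signature header and builds the DNS name a receiver would query for the public key. The selector name and header value are made up for illustration, and the hash and signature values are elided.

```python
def dkim_key_location(dkim_signature: str) -> str:
    """Return the DNS name where the DKIM public key is published."""
    tags = {}
    for part in dkim_signature.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            # Header values may be folded across lines, so strip all whitespace.
            tags[key.strip()] = "".join(value.split())
    return f"{tags['s']}._domainkey.{tags['d']}"


# Hypothetical header; "sel2025" is an invented selector name.
header = "v=1; a=rsa-sha256; d=outreach.company.com; s=sel2025; h=from:to:subject; bh=...; b=..."
print(dkim_key_location(header))  # sel2025._domainkey.outreach.company.com
```

If the d= value shows your ESP's domain rather than your own, the signature can still verify, but it will not align with your From: domain for DMARC purposes.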
DMARC (Domain-based Message Authentication, Reporting & Conformance)
DMARC ties SPF and DKIM together by adding alignment verification: it checks that the domain in at least one passing authentication result (SPF or DKIM) matches the domain in the visible From: header. This closes the gap where SPF and DKIM alone don’t verify the address recipients actually see. DMARC also provides policy enforcement and reporting. The 3 policy levels work as follows:
- p=none: No action taken on failing emails. Messages deliver normally. You receive aggregate reports showing authentication results. This is purely a monitoring mode with zero spoofing protection.
- p=quarantine: Failing emails are routed to the recipient’s spam/junk folder. Partial protection: you catch most unauthorised senders while allowing recovery of false positives.
- p=reject: Failing emails are blocked entirely and never delivered. The strongest anti-spoofing posture, but misconfigured legitimate senders will also be blocked.
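The alignment check that sits underneath these policies can be sketched in a few lines. This assumes relaxed alignment and uses a deliberately naive organisational-domain rule (the last two labels); real implementations consult the Public Suffix List, so treat the helper names and logic as illustrative only.

```python
def org_domain(domain: str) -> str:
    # Simplification: take the last two labels. This is wrong for suffixes
    # such as .co.uk; production code should use the Public Suffix List.
    return ".".join(domain.lower().rstrip(".").split(".")[-2:])


def dmarc_passes(from_domain: str,
                 spf_pass: bool, spf_domain: str,
                 dkim_pass: bool, dkim_domain: str) -> bool:
    """DMARC passes if SPF or DKIM both passes and aligns with From:."""
    spf_aligned = spf_pass and org_domain(spf_domain) == org_domain(from_domain)
    dkim_aligned = dkim_pass and org_domain(dkim_domain) == org_domain(from_domain)
    return spf_aligned or dkim_aligned


# Forwarded message: SPF broke in transit, but the aligned DKIM signature survives.
print(dmarc_passes("company.com", False, "company.com", True, "outreach.company.com"))  # True

# ESP-signed message: both checks pass, but neither domain matches From:.
print(dmarc_passes("company.com", True, "bounce.esp-mail.example", True, "esp-mail.example"))  # False
```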
The recommended progression is: start at p=none for 30-90 days while collecting DMARC aggregate reports (rua) to identify all legitimate sending sources. Then move to p=quarantine with a gradual percentage ramp (pct=25 → pct=50 → pct=75 → pct=100) over 60-90 days. Finally advance to p=reject with the same percentage ramp. The benchmark for advancing is a 98%+ DMARC compliance rate in your aggregate reports. These reports – sent as daily XML files – show every sending IP, message counts, SPF/DKIM pass/fail results, alignment status, and whether receivers overrode your policy. Use a parsing tool like dmarcian, EasyDMARC, or Postmark’s DMARC Digests to make them readable.
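If you want to sanity-check the 98% benchmark yourself, the sketch below computes a compliance rate from an aggregate report. It assumes a report without XML namespaces that follows the usual record/row/policy_evaluated layout; the sample data and IP addresses are invented, and a parsing tool remains the practical choice for daily use.

```python
import xml.etree.ElementTree as ET

# Invented sample in the common aggregate-report layout (heavily trimmed).
SAMPLE = """<feedback>
  <record><row><source_ip>192.0.2.10</source_ip><count>980</count>
    <policy_evaluated><disposition>none</disposition><dkim>pass</dkim><spf>fail</spf></policy_evaluated>
  </row></record>
  <record><row><source_ip>198.51.100.7</source_ip><count>20</count>
    <policy_evaluated><disposition>none</disposition><dkim>fail</dkim><spf>fail</spf></policy_evaluated>
  </row></record>
</feedback>"""


def dmarc_compliance(xml_text: str) -> float:
    """Share of reported messages where aligned DKIM or SPF passed."""
    root = ET.fromstring(xml_text)
    total = passed = 0
    for row in root.iter("row"):
        count = int(row.findtext("count"))
        evaluated = row.find("policy_evaluated")
        total += count
        if "pass" in (evaluated.findtext("dkim"), evaluated.findtext("spf")):
            passed += count
    return passed / total if total else 0.0


rate = dmarc_compliance(SAMPLE)
print(f"{rate:.1%}", "ready to advance" if rate >= 0.98 else "keep monitoring")
```

The failing source IP in a report like this is usually either a forgotten legitimate service or a spoofer; identifying which is the whole point of the p=none phase.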
Dedicated domains need weeks of warm-up, not days
Never use your primary business domain for cold outreach. If cold emails trigger spam complaints or blacklisting, the damage bleeds into transactional email, internal communications, and customer messages. Email providers evaluate reputation at the domain level, not the mailbox level, so a separate address on the same domain offers no protection.
Rules for setting up alternative domains
- Choose a cold outreach domain that is visually close to your primary domain. Your best option is a different TLD extension (company.co, company.io) or a branded variant (getcompany.com, trycompany.com)
- Prioritise .com domains. Exotic TLDs like .xyz or .info carry worse reputational baselines
- Set up a 301 redirect to your primary website so prospects who investigate find a real company
- Before purchasing, check MXToolbox or Spamhaus to verify the domain has no inherited blacklist history
The data on infrastructure sizing is consistent: run 2-3 mailboxes per domain, each sending 30-50 cold emails per day. Google Workspace technically allows 2,000 sends per day, but the practical safe limit for cold outreach is dramatically lower. Managing outbound infrastructure for 400+ clients, we consistently see this play out—our campaigns achieve 95%+ inbox placement, and horizontal domain scaling is a core part of why. Scale horizontally (more mailboxes and domains) rather than pushing individual mailboxes harder.
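The sizing arithmetic is easy to sketch. The defaults below take the conservative end of the per-mailbox range (30 emails per day) and 3 mailboxes per domain; both are parameters you would adjust, and the function name is hypothetical.

```python
import math


def infrastructure_needed(daily_target: int,
                          per_mailbox: int = 30,
                          mailboxes_per_domain: int = 3) -> tuple[int, int]:
    """Return (mailboxes, domains) needed to hit a daily cold-email target."""
    mailboxes = math.ceil(daily_target / per_mailbox)
    domains = math.ceil(mailboxes / mailboxes_per_domain)
    return mailboxes, domains


print(infrastructure_needed(600))  # (20, 7)
```

So a 600-email daily target means roughly 20 mailboxes spread over 7 domains, rather than a handful of mailboxes pushed toward their technical limits.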
Warm-up timelines vary by source, but the realistic range is 2-4 weeks minimum and 6-8 weeks for teams planning to send at higher volumes. The daily progression follows a consistent pattern across authoritative sources:
- Days 1-7: 5-10 emails to known contacts who will engage (target 90%+ open rates)
- Days 8-14: 15-25 emails, expanding the recipient circle
- Days 15-21: 25-50 emails, monitoring that bounce rates stay below 2%
- Days 22-30+: Begin introducing cold outreach at 20-30% of daily volume, increasing gradually. The critical rule is to never increase volume by more than 20% in a single day
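One way to respect the 20% rule is to generate the daily volumes up front. The sketch below is an assumption-laden illustration: it starts from a given volume, grows by at most 20% per day using integer arithmetic, and stops at the target. The function name and starting values are hypothetical.

```python
def ramp_schedule(start: int, target: int, max_increase_pct: int = 20) -> list[int]:
    """Daily volumes from start to target, growing at most max_increase_pct per day."""
    volumes = [start]
    while volumes[-1] < target:
        # Integer division rounds the step down, so the cap is never exceeded.
        # The floor of 1 keeps the ramp moving; with the default cap it only
        # breaks the 20% rule when the current volume is below 5.
        step = max(1, volumes[-1] * max_increase_pct // 100)
        volumes.append(min(target, volumes[-1] + step))
    return volumes


print(ramp_schedule(5, 50))
# [5, 6, 7, 8, 9, 10, 12, 14, 16, 19, 22, 26, 31, 37, 44, 50]
```

Under these assumptions, going from 5 to 50 emails per day takes 16 sending days, which is consistent with the multi-week timelines above.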
Warm-up tools like Instantly, Lemwarm, MailReach, and Warmbox operate on peer-to-peer networks: your inbox joins a pool, the tool sends emails between network inboxes, and those inboxes automatically open, reply, and rescue messages from spam.
However, their effectiveness is contested. Independent deliverability consultancy Postbox Services tested most major tools in mid-2025 and found “none demonstrated any meaningful improvements.” Google has actively told warm-up providers their tools violate Terms of Service, and GMass shut down its warm-up service entirely.
The practical consensus: use warm-up tools as one component, not a silver bullet, and validate with independent metrics like Google Postmaster Tools or GlockApps seed tests. Manual warm-up with real contacts who genuinely reply remains the gold standard.
The 0.1% spam rate is where problems start
Google’s February 2024 requirements established 2 tiers for spam complaint rates:
The recommended maximum is 0.1% (1 complaint per 1,000 delivered emails). Exceeding this triggers increased scrutiny and reduced inbox placement.
The hard enforcement ceiling is 0.3% (3 per 1,000). Exceeding 0.3% makes you ineligible for Google’s mitigation programs, and as of November 2025, triggers active permanent rejections with 5xx SMTP error codes. Recovery requires maintaining rates below 0.3% for 7 consecutive days.
Yahoo enforces the same 0.3% threshold. Microsoft adopted parallel standards on May 5, 2025, though it does not publish exact numeric thresholds. Best-in-class B2B teams maintain 0.03-0.06% complaint rates. Across the campaigns we manage, 0.05% is our benchmark.
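The two thresholds translate into a simple check you could run against your own complaint counts. This is a sketch; the status labels are invented, and integer comparisons are used to avoid floating-point edge cases at exactly 0.1% and 0.3%.

```python
def complaint_status(complaints: int, delivered: int) -> str:
    """Classify a complaint rate against the 0.1% and 0.3% thresholds."""
    if complaints * 1000 > delivered * 3:   # above 0.3%
        return "enforcement"
    if complaints * 1000 > delivered:       # above 0.1%
        return "warning"
    return "ok"


print(complaint_status(5, 10_000))   # ok          (0.05%)
print(complaint_status(20, 10_000))  # warning     (0.20%)
print(complaint_status(40, 10_000))  # enforcement (0.40%)
```

At typical cold outreach volumes the numbers are unforgiving: a mailbox sending 40 emails a day reaches 0.1% with a single complaint per 1,000 emails, or about one complaint every 25 sending days.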
For bounce rates, the consensus safe ceiling is under 2% for hard bounces. A high bounce rate is also the most common inherited problem we see when clients come to us: lists that have never been verified, or data that has been sitting untouched for 12+ months. Poor list hygiene is usually the first thing we fix.
At 2-3% bounce rate, ISPs begin routing more mail to spam. At 3-5%, active throttling and ESP warnings kick in. Above 5%, expect blocklisting and potential account suspension. Hard bounces (permanently invalid addresses) are significantly more damaging than soft bounces (full mailbox, temporary server issues) because they signal sending to non-existent addresses—a hallmark of spammers. Every hard bounce should be immediately and permanently suppressed.
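The bounce tiers and the suppression rule can be sketched together. The tier labels, the in-memory suppression set, and the function names are all hypothetical; a real setup would persist suppressions in your sending platform or CRM.

```python
def bounce_tier(hard_bounces: int, sent: int) -> str:
    """Map a hard bounce rate onto the tiers described above."""
    if hard_bounces * 100 < sent * 2:
        return "safe"
    if hard_bounces * 100 < sent * 3:
        return "spam-folder risk"
    if hard_bounces * 100 <= sent * 5:
        return "throttling risk"
    return "blocklist risk"


suppressed: set[str] = set()


def record_bounce(address: str, kind: str) -> None:
    """Permanently suppress hard bounces; soft bounces are left for retry logic."""
    if kind == "hard":
        suppressed.add(address.lower())


def sendable(addresses: list[str]) -> list[str]:
    """Filter out any address that has ever hard-bounced."""
    return [a for a in addresses if a.lower() not in suppressed]


record_bounce("Gone@example.com", "hard")
record_bounce("full@example.com", "soft")
print(sendable(["gone@example.com", "full@example.com"]))  # ['full@example.com']
print(bounce_tier(12, 1000))  # safe (1.2%)
```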
Email verification before every campaign is non-negotiable for cold outreach. The major tools are:
- ZeroBounce (AI scoring, activity data, ~$16/2,000 emails)
- Hunter.io (integrated finding + verification, from $49/mo)
- NeverBounce (auto-sync with ESP integrations, ~$8/1,000)
- Bouncer (budget-friendly at $8/1,000)
- Emailable (fastest processing, $30/5,000 credits)
All of these email verification tools check syntax, DNS/MX records, SMTP mailbox existence, catch-all detection, disposable email detection, and role-based address identification. Real-world accuracy testing shows meaningful variation across providers, particularly on catch-all domains.
Approximately 30% of B2B email servers are configured as catch-all, meaning they accept all incoming email regardless of whether the mailbox exists. Standard SMTP verification cannot distinguish valid from invalid addresses on these domains, creating a significant blind spot. Segment catch-all addresses separately and monitor engagement closely.
Email databases decay at roughly 25% per year, making regular verification of your lists essential.
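To see why stale lists are so dangerous, it helps to run the decay figure against the 2% bounce ceiling. The sketch below assumes decay compounds smoothly at 25% per year and that every decayed address would hard-bounce; both are simplifications, so read the result as an order of magnitude rather than a prediction.

```python
import math


def stale_fraction(months: float, annual_decay: float = 0.25) -> float:
    """Fraction of a list expected to have gone stale after `months`."""
    return 1 - (1 - annual_decay) ** (months / 12)


def months_until(threshold: float, annual_decay: float = 0.25) -> float:
    """Months until the stale fraction reaches `threshold`."""
    return 12 * math.log(1 - threshold) / math.log(1 - annual_decay)


print(f"{stale_fraction(12):.0%} stale after 12 months")
print(f"{months_until(0.02):.2f} months to reach 2% stale")
```

Under these assumptions a freshly verified list crosses the 2% mark in under a month, which is the reasoning behind verifying before every campaign rather than once a quarter.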
Volume spikes are the number-one red flag for inbox providers
Sending patterns matter as much as content—going from zero to hundreds of emails in a day is the single most reliable trigger for spam filtering across all major providers.
Your sending cadence should follow 2-5 minute randomised intervals between emails, distributed across business hours in the recipient’s timezone. Fixed-time bulk sends create velocity spikes that pattern-matching algorithms flag as automation, so always vary intervals randomly (e.g., anywhere between 2 and 5 minutes rather than exactly every 3 minutes), randomise send times within a window, and maintain some weekend sending activity at reduced volume to avoid the appearance of complete stoppage.
Plain text outperforms HTML for cold outreach deliverability
We do not send HTML emails for any client campaign. Every email we send looks like something a human typed to a colleague, because that is exactly what inbox providers reward.
Recent analysis of 2.2 million emails found HTML cold emails bounced 674% more than plain text. The structural reason is straightforward: HTML-heavy emails pattern-match to marketing blasts, and inbox providers apply stricter bulk-sender filtering when that pattern comes from a personal-style mailbox. Tracking pixels (invisible 1×1 images for open tracking), redirect-based link tracking, heavy images, and complex HTML formatting all increase spam filter scrutiny.
The 2026 recommendation to improve email deliverability is straightforward: use plain text for all cold outreach.
Disable open tracking entirely. If you must include a link, limit to 1 (your website or calendar booking page) and avoid tracked redirects. And keep emails under 80 words with a single call-to-action.
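These limits are easy to enforce with a pre-send check. The sketch below counts words and links in a plain-text body; the URL pattern is deliberately simple and the function name is hypothetical.

```python
import re

URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)


def check_cold_email(body: str, max_words: int = 80, max_links: int = 1) -> list[str]:
    """Return a list of problems; an empty list means the body is within limits."""
    problems = []
    word_count = len(body.split())
    link_count = len(URL_RE.findall(body))
    if word_count >= max_words:
        problems.append(f"{word_count} words (keep it under {max_words})")
    if link_count > max_links:
        problems.append(f"{link_count} links (limit is {max_links})")
    return problems


body = "Hi Sam, saw your team is hiring SDRs. Worth a quick chat? https://company.com"
print(check_cold_email(body))  # []
```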
Google Postmaster Tools shows you exactly what Gmail thinks of your domain
Google Postmaster Tools (postmaster.google.com) provides the only direct window into how Gmail evaluates your sending. Setup requires adding a DNS TXT verification record to your domain. After verification, data typically appears within 24-48 hours, though you need roughly 100+ daily messages to Gmail recipients for metrics to populate reliably. The tool shows data only for messages to @gmail.com and @googlemail.com, not Google Workspace corporate accounts, Yahoo, Outlook, or any other provider.
The current v2 interface (v1 was retired in late 2025) centres on several dashboards.
The Spam Rate dashboard is the single most critical metric; it shows the percentage of messages delivered to the inbox that recipients subsequently marked as spam, with visual threshold lines at 0.1% and 0.3%.
The Compliance Status dashboard, added in mid-2024, provides a binary pass/fail check against Google’s sender guidelines covering authentication, DNS records, encryption, spam rate, and (for bulk senders) DMARC and one-click unsubscribe.
The Authentication dashboard displays pass rates for SPF, DKIM, and DMARC.
Delivery Errors tracks the percentage of authenticated messages that were rejected or temporarily failed, with reason codes including suspected spam, DMARC policy issues, low domain reputation, IP blacklisting, and missing PTR records.
There is also a Feedback Loop dashboard for senders who have configured feedback IDs, and an Encryption dashboard showing TLS adoption rates.
Key limitations: Data has a 24-48 hour lag, aggregated ratings don’t offer per-campaign granularity, low-volume days may show no data due to privacy suppression, and if Gmail is already routing your mail to spam, the user-reported spam rate can appear artificially low since fewer messages reach the inbox for recipients to report.
For Microsoft, the parallel monitoring tool is SNDS (Smart Network Data Services), which provides IP-based (not domain-based) reputation with a simple Green/Yellow/Red system. Microsoft’s system is intentionally more opaque and covers only consumer domains.
What Google, Yahoo and Microsoft changed (and what it means for you)
The February 2024 Google and Yahoo requirements created a 2-tier system. All senders to personal Gmail must have SPF or DKIM, valid forward/reverse DNS (PTR records), TLS encryption, and maintain spam rates below 0.3%.
Bulk senders (5,000+ messages per day to personal Gmail) must additionally have SPF and DKIM, DMARC with at least p=none, DMARC alignment, and one-click unsubscribe via RFC 8058 headers for marketing messages. A critical December 2023 clarification narrowed scope to personal Gmail accounts only, excluding Google Workspace corporate accounts. Yahoo announced parallel requirements simultaneously without specifying a numeric bulk sender threshold. Microsoft followed on May 5, 2025, requiring SPF, DKIM, and DMARC (minimum p=none) from domains sending 5,000+ daily emails to outlook.com, hotmail.com, and live.com.
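The two tiers can be expressed as a checklist. The sketch below covers only the configuration requirements listed above for Gmail (spam rate is a separate, ongoing measurement), and the class and field names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class SenderSetup:
    spf: bool
    dkim: bool
    dmarc: bool                  # any published policy, p=none or stricter
    dmarc_aligned: bool
    ptr: bool                    # valid forward and reverse DNS
    tls: bool
    one_click_unsubscribe: bool  # RFC 8058 headers on marketing messages
    daily_gmail_volume: int      # messages per day to personal Gmail


def missing_requirements(s: SenderSetup) -> list[str]:
    """List the Gmail sender requirements this setup does not yet meet."""
    missing = []
    if not (s.spf or s.dkim):
        missing.append("SPF or DKIM")
    if not s.ptr:
        missing.append("valid PTR record")
    if not s.tls:
        missing.append("TLS encryption")
    if s.daily_gmail_volume >= 5000:  # bulk-sender tier
        if not (s.spf and s.dkim):
            missing.append("both SPF and DKIM")
        if not s.dmarc:
            missing.append("DMARC policy")
        if not s.dmarc_aligned:
            missing.append("DMARC alignment")
        if not s.one_click_unsubscribe:
            missing.append("one-click unsubscribe")
    return missing


setup = SenderSetup(spf=True, dkim=False, dmarc=False, dmarc_aligned=False,
                    ptr=True, tls=True, one_click_unsubscribe=False,
                    daily_gmail_volume=6000)
print(missing_requirements(setup))
# ['both SPF and DKIM', 'DMARC policy', 'DMARC alignment', 'one-click unsubscribe']
```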
Google’s enforcement escalated through defined phases: warnings and temporary 4xx errors in February 2024, permanent 5xx rejections beginning April 2024, one-click unsubscribe enforcement in June 2024, and full-scale permanent rejections from November 2025 onward. Non-compliant emails now receive specific SMTP error codes – 550-5.7.26 for authentication failure, 550-5.7.1 for spam rate violations.
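If you have access to bounce logs, a small classifier can separate these rejections from ordinary bounces. The mapping below uses only the two codes named above; 5.7.1 in particular is a general policy code that other receivers use for other reasons, so treat the labels as a Gmail-specific assumption. The response strings are abbreviated placeholders, not real server output.

```python
import re

# Enhanced status codes named above, mapped to the reason given for Gmail.
REJECTION_REASONS = {
    "5.7.26": "authentication failure",
    "5.7.1": "spam rate violation",
}


def classify_rejection(response: str) -> str:
    """Classify an SMTP response line by reply code and enhanced status code."""
    match = re.match(r"(\d{3})[- ](\d\.\d{1,3}\.\d{1,3})", response)
    if not match:
        return "unknown"
    reply_code, enhanced = match.groups()
    if reply_code.startswith("5"):
        kind = "permanent"
    elif reply_code.startswith("4"):
        kind = "temporary"
    else:
        kind = "other"
    return f"{kind}: {REJECTION_REASONS.get(enhanced, 'unclassified')}"


print(classify_rejection("550-5.7.26 (message text elided)"))  # permanent: authentication failure
print(classify_rejection("421-4.7.0 (message text elided)"))   # temporary: unclassified
```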
You are unlikely to hit the 5,000/day threshold to personal Gmail if your primary targets are corporate/Workspace accounts. However, all-sender requirements (authentication, DNS, spam rates) still apply at any volume. Google Workspace accounts use sophisticated filtering shaped by the same deliverability signals. And the industry consensus is clear: what’s required for bulk senders today will become required for all senders in the near future. Google’s original draft included Workspace accounts, and every major cold email platform now treats SPF, DKIM, and DMARC as baseline requirements.
Trigger words still contribute to the score
Gmail’s spam filter is an ML/AI-driven behavioural analysis system, not a simple keyword blocker. It simultaneously evaluates sender reputation, engagement history, sending patterns, message structure, content-sender relationship, and word patterns.
A single use of “free” won’t trigger filtering; “FREE FREE FREE Trial!!!” combined with poor sender reputation and mass-blast patterns will. That said, specific word patterns remain part of the overall scoring, particularly when combined with other negative signals.
High-risk categories for B2B cold email include urgency/pressure language (“Act now,” “Limited time,” “Don’t miss out”), financial claims (“Guaranteed income,” “100% free”), overpromising (“Risk-free,” “No obligation,” “Guaranteed”), and aggressive CTAs (“Buy now,” “Click here,” “Sign up free”). Cold-email-specific triggers include overused templates (“I’d love to get on a quick call”), misleading “Re:” or “Fwd:” subject prefixes that imply prior conversation, and generic greetings that signal bulk sending. An often-cited statistic: 69% of recipients report emails as spam based solely on the subject line.
Formatting patterns matter as much as words. URL shorteners (bit.ly, tinyurl) are strongly flagged as potential phishing vectors. Multiple links in a single cold email raise suspicion. Your best move is zero links in the first cold email and a maximum of 1 in follow-ups. Open tracking pixels now trigger Gmail warning labels visible to recipients. Excessive capitalisation, exclamation marks, coloured fonts, and complex HTML signatures all contribute to spam scoring.
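Several of these patterns can be caught before sending. The sketch below flags the ones that are mechanical to detect; the numeric thresholds for exclamation marks and capitalised words are arbitrary assumptions, and no static check reproduces what a provider's filter actually does.

```python
import re

SHORTENERS = ("bit.ly", "tinyurl.com")
LINK_RE = re.compile(r"https?://\S+", re.IGNORECASE)


def content_flags(subject: str, body: str, first_touch: bool = True) -> list[str]:
    """Flag formatting patterns from the section above in a draft email."""
    flags = []
    links = LINK_RE.findall(body)
    if first_touch and re.match(r"\s*(re|fwd?):", subject, re.IGNORECASE):
        flags.append("misleading Re:/Fwd: prefix")
    if first_touch and links:
        flags.append("link in first email")
    elif len(links) > 1:
        flags.append("more than one link")
    if any(s in link.lower() for link in links for s in SHORTENERS):
        flags.append("URL shortener")
    if (subject + body).count("!") > 1:  # arbitrary threshold
        flags.append("excessive exclamation marks")
    shouted = [w for w in body.split() if len(w) >= 4 and w.isupper()]
    if len(shouted) >= 2:  # arbitrary threshold
        flags.append("excessive capitalisation")
    return flags


print(content_flags("Re: quick question", "FREE TRIAL today!! https://bit.ly/x"))
```

For the draft above, the function reports the misleading prefix, the first-touch link, the shortener, the exclamation marks, and the capitalisation.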
The overall direction is clear: emails that look like something a human typed to a colleague perform best. Emails that look like they came from a marketing automation system trigger stricter evaluation, regardless of the specific words used.
Final thoughts on improving email deliverability
The state of play as of early 2026 has consolidated around a clear set of technical requirements and behavioural expectations. Authentication (SPF + DKIM + DMARC) is no longer optional at any volume – it is enforced infrastructure. The 0.1% spam complaint rate is the practical ceiling for sustainable cold outreach, with 0.3% representing the hard enforcement ceiling. Domain warm-up requires genuine patience: 2-4 weeks minimum with conservative daily volume ramps and horizontal scaling across multiple domains and mailboxes.
Plain text has decisively won the deliverability argument for cold outreach. And Google Postmaster Tools, despite its Gmail-only limitation, remains the only direct feedback channel showing how the world’s largest email provider evaluates your sending.
The most important strategic insight is that all 3 major providers (Google, Yahoo, and Microsoft) have now aligned on nearly identical authentication and complaint-rate standards. What was “best practice” in 2023 is now actively enforced with permanent message rejection in 2026.
The trajectory points in only one direction: tighter enforcement, broader scope, and less tolerance for undisciplined sending. Building the technical foundation correctly now is not overcautious. It is the minimum viable infrastructure for deliverability, and it is the foundation every campaign we run is built on.
If you want to know how your current outbound setup measures up against these benchmarks, run it through our free outbound assessment tool.