How to Validate Emails in Python: API Tutorial with Code

Complete Python tutorial for email validation. Working code examples using API integration, batch processing, and error handling.

Published February 1, 2026

Max Sterling
February 1, 2026 · 9 min read

You have a CSV with 10,000 email addresses. Your boss wants to know which ones are valid before sending a campaign. Regex isn't enough. You need real validation.

This tutorial shows you how to build a Python email validator that checks syntax, domain validity, and SMTP responses, and detects disposable addresses.

We'll start simple and add complexity. By the end, you'll have production-ready code.

What We're Building

A Python script that:

  • Validates single emails via API
  • Processes CSV files in batches
  • Handles rate limits and errors gracefully
  • Exports results with detailed status codes
  • Tracks validation costs

Time to complete: 20-30 minutes
Requirements: Python 3.7+, the requests, pandas, and python-dotenv libraries, and an API key

Setup

Install dependencies:

pip install requests pandas python-dotenv

Create a .env file for your API key:

EMAIL_VALIDATION_API_KEY=your_key_here

Never hardcode API keys. Use environment variables. Before diving into code, make sure you understand the difference between email validation and verification.
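
A minimal sketch of reading the key and failing fast when it is missing, so a misconfigured environment shows up before the first request. The helper name `load_api_key` is hypothetical; it reads only from the process environment, so call `load_dotenv()` first if the key lives in `.env`.

```python
import os


def load_api_key(env=None, name="EMAIL_VALIDATION_API_KEY"):
    """Return the API key from the environment, or raise if it is unset.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return key
```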

Method 1: Simple Single Email Validation

Let's start with the basics. Validate one email.

import requests
import os
from dotenv import load_dotenv

load_dotenv()

def validate_email(email):
    """Validate a single email address."""

    api_key = os.getenv('EMAIL_VALIDATION_API_KEY')
    url = 'https://api.emails-wipes.com/v1/validate'

    headers = {
        'Authorization': f'Bearer {api_key}',
        'Content-Type': 'application/json'
    }

    payload = {'email': email}

    try:
        response = requests.post(url, json=payload, headers=headers, timeout=10)
        response.raise_for_status()
        return response.json()
    except requests.exceptions.RequestException as e:
        return {'error': str(e), 'email': email}

# Test it
result = validate_email('[email protected]')
print(result)

This returns a JSON response:

{
  "email": "[email protected]",
  "status": "valid",
  "is_disposable": false,
  "is_role_based": false,
  "domain": "gmail.com",
  "smtp_check": "ok",
  "mx_records": true
}

Simple. But not production-ready. Let's improve it.

Method 2: Batch Processing with CSV

You rarely validate one email. You validate thousands. Let's process a CSV. For best practices on this, see our guide on how to validate emails in bulk.

import pandas as pd
import time

def validate_csv(input_file, output_file):
    """Validate emails from CSV and export results."""

    # Read input CSV
    df = pd.read_csv(input_file)

    # Assuming email column is named 'email'
    if 'email' not in df.columns:
        raise ValueError("CSV must have an 'email' column")

    results = []

    for index, row in df.iterrows():
        email = row['email']

        # Validate
        result = validate_email(email)

        # Add result to list
        results.append({
            'email': email,
            'status': result.get('status', 'error'),
            'is_disposable': result.get('is_disposable', None),
            'is_role_based': result.get('is_role_based', None),
            'error': result.get('error', None)
        })

        # Rate limiting: sleep 0.1 seconds between requests
        time.sleep(0.1)

        # Progress indicator
        if (index + 1) % 100 == 0:
            print(f"Processed {index + 1}/{len(df)} emails")

    # Create results DataFrame
    results_df = pd.DataFrame(results)

    # Merge with original data
    output_df = pd.merge(df, results_df, on='email', how='left')

    # Export
    output_df.to_csv(output_file, index=False)

    print(f"\nValidation complete!")
    print(f"Results saved to {output_file}")

    # Summary stats
    print(f"\nSummary:")
    print(results_df['status'].value_counts())

# Usage
validate_csv('emails.csv', 'validated_emails.csv')

This works but has problems:

  • Slow for large lists (one API call per email)
  • No retry logic if API fails
  • Doesn't handle duplicate emails (each one costs another API call, and the merge multiplies their rows)
  • Basic rate limiting

Let's fix these.
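
Duplicates often hide behind differences in case or whitespace, so it helps to normalize before deduplicating. A small standard-library sketch of that idea follows; the function name `dedupe_emails` is illustrative and not part of any API.

```python
def dedupe_emails(emails):
    """Strip, lowercase, and de-duplicate addresses, keeping first-seen order."""
    seen = set()
    unique = []
    for raw in emails:
        if not isinstance(raw, str):
            continue  # skip blanks that a CSV reader returns as None/NaN
        email = raw.strip().lower()
        if email and email not in seen:
            seen.add(email)
            unique.append(email)
    return unique


print(dedupe_emails([" Ann@Example.com", "ann@example.com", None, "bob@example.com"]))
# ['ann@example.com', 'bob@example.com']
```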

Method 3: Production-Ready Validator

This version includes error handling, retries, deduplication, and proper rate limiting.

import requests
import pandas as pd
import time
import os
from dotenv import load_dotenv
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

load_dotenv()

class EmailValidator:
    def __init__(self, api_key=None):
        self.api_key = api_key or os.getenv('EMAIL_VALIDATION_API_KEY')
        self.base_url = 'https://api.emails-wipes.com/v1'
        self.session = self._create_session()
        self.validation_count = 0

    def _create_session(self):
        """Create requests session with retry logic."""
        session = requests.Session()

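        # urllib3 does not retry POST on status codes by default, so allow it.
        # raise_on_status=False hands back the last response when retries run
        # out, so validate_single still sees the final status code (e.g. 429).
        retry_strategy = Retry(
            total=3,
            backoff_factor=1,
            status_forcelist=[429, 500, 502, 503, 504],
            allowed_methods=["POST"],  # requires urllib3 >= 1.26
            raise_on_status=False
        )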

        adapter = HTTPAdapter(max_retries=retry_strategy)
        session.mount("http://", adapter)
        session.mount("https://", adapter)

        return session

    def validate_single(self, email):
        """Validate a single email address."""

        headers = {
            'Authorization': f'Bearer {self.api_key}',
            'Content-Type': 'application/json'
        }

        payload = {'email': email.strip().lower()}

        try:
            response = self.session.post(
                f'{self.base_url}/validate',
                json=payload,
                headers=headers,
                timeout=10
            )

            response.raise_for_status()
            self.validation_count += 1

            return response.json()

        except requests.exceptions.HTTPError as e:
            # Include 'status' so failed rows still appear in the summary
            status_code = e.response.status_code
            if status_code == 429:
                return {'error': 'rate_limit', 'status': 'error', 'email': email}
            return {'error': f'http_{status_code}', 'status': 'error', 'email': email}

        except requests.exceptions.RequestException:
            return {'error': 'network_error', 'status': 'error', 'email': email}

    def validate_batch(self, emails, rate_limit=10):
        """
        Validate multiple emails with rate limiting.

        Args:
            emails: List of email addresses
            rate_limit: Max requests per second (default: 10)
        """

        results = []
        delay = 1.0 / rate_limit  # Time between requests

        for i, email in enumerate(emails):
            result = self.validate_single(email)
            results.append(result)

            # Progress
            if (i + 1) % 100 == 0:
                print(f"Validated {i + 1}/{len(emails)} emails")

            # Rate limiting
            time.sleep(delay)

        return results

    def validate_csv(self, input_file, output_file, email_column='email'):
        """
        Validate emails from CSV file.

        Args:
            input_file: Path to input CSV
            output_file: Path to output CSV
            email_column: Name of email column (default: 'email')
        """

        print(f"Reading {input_file}...")
        df = pd.read_csv(input_file)

        if email_column not in df.columns:
            raise ValueError(f"Column '{email_column}' not found in CSV")

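        # Normalize, then deduplicate emails. validate_single strips and
        # lowercases before sending, so this keeps the merge keys consistent.
        original_count = len(df)
        df = df.dropna(subset=[email_column]).copy()
        df[email_column] = df[email_column].astype(str).str.strip().str.lower()
        df_dedup = df.drop_duplicates(subset=[email_column])
        dedup_count = len(df_dedup)

        if original_count != dedup_count:
            print(f"Removed {original_count - dedup_count} blank or duplicate emails")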

        # Extract unique emails
        emails = df_dedup[email_column].tolist()

        print(f"Validating {len(emails)} unique emails...")

        # Validate
        results = self.validate_batch(emails)

        # Create results DataFrame
        results_df = pd.DataFrame(results)

        # Merge with original data (deduplicated)
        output_df = pd.merge(
            df_dedup, 
            results_df, 
            left_on=email_column,
            right_on='email',
            how='left'
        )

        # Export
        output_df.to_csv(output_file, index=False)

        # Statistics
        self._print_summary(results_df)

        print(f"\nResults saved to: {output_file}")
        print(f"Total validations performed: {self.validation_count}")

        return output_df

    def _print_summary(self, results_df):
        """Print validation summary statistics."""

        print("\n" + "="*50)
        print("VALIDATION SUMMARY")
        print("="*50)

        # Status breakdown
        print("\nStatus Breakdown:")
        print(results_df['status'].value_counts())

        # Calculate percentages
        total = len(results_df)
        valid = len(results_df[results_df['status'] == 'valid'])
        invalid = len(results_df[results_df['status'] == 'invalid'])

        print(f"\nValid: {valid}/{total} ({valid/total*100:.1f}%)")
        print(f"Invalid: {invalid}/{total} ({invalid/total*100:.1f}%)")

        # Disposable emails
        if 'is_disposable' in results_df.columns:
            disposable = results_df['is_disposable'].sum()
            print(f"Disposable: {disposable} ({disposable/total*100:.1f}%)")

        # Role-based emails (learn more: /blog/role-based-email-addresses-guide.html)
        if 'is_role_based' in results_df.columns:
            role_based = results_df['is_role_based'].sum()
            print(f"Role-based: {role_based} ({role_based/total*100:.1f}%)")

# Usage example
if __name__ == "__main__":
    validator = EmailValidator()

    # Validate single email
    result = validator.validate_single('[email protected]')
    print(result)

    # Validate CSV
    validator.validate_csv(
        input_file='contacts.csv',
        output_file='contacts_validated.csv',
        email_column='email'
    )

This is production-ready. It handles:

  • Retries: Automatic retry on network errors or rate limits
  • Deduplication: Doesn't waste API calls on duplicate emails
  • Rate limiting: Configurable requests per second
  • Progress tracking: Shows validation progress
  • Error handling: Gracefully handles API failures
  • Summary stats: Shows results breakdown
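
One caveat on the rate limiting: validate_batch sleeps a fixed 1/rate_limit seconds after every request, so the time spent inside the request is added on top and real throughput ends up below the configured rate. A hedged sketch of a limiter that subtracts the elapsed time is shown here; the class name `RateLimiter` and its injectable `clock` and `sleep` parameters are assumptions for illustration.

```python
import time


class RateLimiter:
    """Space calls at least 1/rate seconds apart, counting time already spent."""

    def __init__(self, rate_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_second
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until the next call is allowed, then record the call time."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

To use it, create one limiter per validator and call `limiter.wait()` before each `validate_single` call in place of the fixed `time.sleep(delay)`.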

Advanced: Filtering and Segmentation

Once validated, you can filter by criteria:

def filter_results(input_file, output_file, criteria):
    """
    Filter validated emails by criteria.

    Args:
        input_file: Validated CSV file
        output_file: Filtered output file
        criteria: Dict of filtering rules
    """

    df = pd.read_csv(input_file)

    # Example filters
    filtered = df.copy()

    if criteria.get('valid_only'):
        filtered = filtered[filtered['status'] == 'valid']

    if criteria.get('exclude_disposable'):
        filtered = filtered[filtered['is_disposable'] == False]

    if criteria.get('exclude_role_based'):
        filtered = filtered[filtered['is_role_based'] == False]

    filtered.to_csv(output_file, index=False)

    print(f"Kept {len(filtered)}/{len(df)} emails after filtering")

    return filtered

# Usage
filter_results(
    'contacts_validated.csv',
    'contacts_clean.csv',
    criteria={
        'valid_only': True,
        'exclude_disposable': True,
        'exclude_role_based': False
    }
)
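
For segmentation rather than a single keep/drop filter, one option is to map each API result to a bucket. The sketch below works on the result dicts returned by the validator (not on CSV rows) and uses only fields shown in the sample response, plus the "unknown" status described for catch-all domains. The bucket names and the policy itself are assumptions; adjust them to your own risk tolerance.

```python
def segment(result):
    """Map one validation result dict to 'send', 'review', or 'drop'."""
    if result.get("error") or result.get("status") == "unknown":
        return "review"   # could not be confirmed either way
    if result.get("status") != "valid":
        return "drop"
    if result.get("is_disposable"):
        return "drop"     # deliverable now, but likely short-lived
    if result.get("is_role_based"):
        return "review"   # shared inboxes may warrant a separate campaign
    return "send"
```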

Handling Large Files (100K+ Emails)

For very large files, process in chunks to avoid memory issues. If you're also validating email formats with regex before API calls, check out our email regex patterns guide.

def validate_large_csv(input_file, output_file, chunk_size=1000):
    """Validate large CSV files in chunks."""

    validator = EmailValidator()

    # Process in chunks
    chunks = pd.read_csv(input_file, chunksize=chunk_size)

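    first_chunk = True

    # Fixed result columns keep every chunk's CSV layout identical,
    # even when a chunk contains only errors
    result_columns = ['status', 'is_disposable', 'is_role_based',
                      'domain', 'smtp_check', 'mx_records', 'error']

    for i, chunk in enumerate(chunks):
        print(f"\nProcessing chunk {i+1}...")

        chunk = chunk.dropna(subset=['email'])
        emails = chunk['email'].tolist()
        results = validator.validate_batch(emails)

        # Results come back in input order, so align them by position
        # instead of merging on 'email' (duplicates would multiply rows)
        results_df = pd.DataFrame(results, index=chunk.index).reindex(columns=result_columns)
        output_chunk = pd.concat([chunk, results_df], axis=1)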

        # Write to file (append after first chunk)
        mode = 'w' if first_chunk else 'a'
        header = first_chunk

        output_chunk.to_csv(output_file, mode=mode, header=header, index=False)

        first_chunk = False

    print(f"\nValidation complete! Results in {output_file}")
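
The regex pre-check mentioned above could look roughly like this. The pattern is deliberately loose (one "@", no whitespace, a dot in the domain part) and is only meant to keep obvious junk from costing an API call; it is not a substitute for the API's own syntax check. The name `split_by_syntax` is hypothetical.

```python
import re

# Deliberately loose: one "@", no whitespace, and a dot in the domain part.
_BASIC_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def split_by_syntax(emails):
    """Return (candidates, rejected) so only plausible addresses hit the API."""
    candidates, rejected = [], []
    for email in emails:
        if isinstance(email, str) and _BASIC_EMAIL.match(email.strip()):
            candidates.append(email.strip())
        else:
            rejected.append(email)
    return candidates, rejected
```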

Cost Tracking

Track validation costs in real-time:

class CostTracker:
    def __init__(self, cost_per_validation=0.005):
        self.cost_per_validation = cost_per_validation
        self.total_validations = 0

    def add_validations(self, count):
        self.total_validations += count

    def get_total_cost(self):
        return self.total_validations * self.cost_per_validation

    def print_summary(self):
        print("\n💰 Cost Summary:")
        print(f"Total validations: {self.total_validations:,}")
        print(f"Cost per validation: ${self.cost_per_validation}")
        print(f"Total cost: ${self.get_total_cost():.2f}")

# Integrate with validator
tracker = CostTracker(cost_per_validation=0.005)
validator = EmailValidator()

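# `emails` is the list of addresses to validate, e.g. loaded from your CSV
emails = pd.read_csv('contacts.csv')['email'].dropna().tolist()
results = validator.validate_batch(emails)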
tracker.add_validations(len(results))
tracker.print_summary()
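
It can also help to estimate the bill before spending anything. At the example price of $0.005 per validation, 10,000 unique addresses come to 10,000 × 0.005 = $50.00. A small sketch, assuming each distinct address is billed once (check your provider's actual pricing):

```python
def estimate_cost(emails, cost_per_validation=0.005):
    """Return (unique_count, estimated_cost), counting each address once."""
    unique = {e.strip().lower() for e in emails if isinstance(e, str) and e.strip()}
    return len(unique), len(unique) * cost_per_validation
```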

Error Handling Best Practices

Things will go wrong. Handle them gracefully:

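def safe_validate(email, retries_left=3):
    """Validate with comprehensive error handling."""

    try:
        # Validate
        result = validator.validate_single(email)

        # Check for API errors
        if 'error' in result:
            error_type = result['error']

            if error_type == 'rate_limit':
                # Give up after a few waits instead of recursing forever
                if retries_left <= 0:
                    return {'email': email, 'status': 'error', 'error': error_type}
                print("Rate limit hit. Waiting 60 seconds...")
                time.sleep(60)
                return safe_validate(email, retries_left - 1)  # Retry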

            elif error_type.startswith('http_'):
                print(f"HTTP error for {email}: {error_type}")
                return {'email': email, 'status': 'error', 'error': error_type}

            else:
                return {'email': email, 'status': 'error', 'error': error_type}

        return result

    except Exception as e:
        print(f"Unexpected error validating {email}: {str(e)}")
        return {'email': email, 'status': 'error', 'error': 'unexpected'}

Testing Your Integration

Before running on production data, test with these emails:

test_emails = [
    '[email protected]',           # Should pass
    '[email protected]',  # Should fail (domain doesn't exist)
    '[email protected]',        # Disposable email
    '[email protected]',           # Role-based email
    'not-an-email',               # Syntax error
    'user@',                      # Incomplete
]

for email in test_emails:
    result = validator.validate_single(email)
    print(f"{email:30} → {result.get('status')}")

Expected output:

[email protected]                → valid
[email protected]       → invalid
[email protected]            → valid (but is_disposable=true)
[email protected]               → valid (but is_role_based=true)
not-an-email                   → invalid
user@                          → invalid
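
Rather than comparing the output by eye, the expectations can be checked in code. This is a sketch only: `check_expectations` is a hypothetical helper, and `validate` is any callable that returns a result dict, such as `validator.validate_single` or a stub.

```python
def check_expectations(validate, cases):
    """Run `validate` on each address; return (email, expected, actual) mismatches."""
    mismatches = []
    for email, expected in cases.items():
        actual = validate(email).get("status")
        if actual != expected:
            mismatches.append((email, expected, actual))
    return mismatches
```

An empty list means every address came back with the status you expected.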

Common Gotchas

Gotcha #1: Forgetting to strip whitespace
Always email.strip() before validation. Leading/trailing spaces cause syntax errors.

Gotcha #2: Case sensitivity
Normalize to lowercase: email.lower(). Domains are case-insensitive, and in practice nearly every provider treats the local part the same way, so mixed-case copies of one address mostly just create duplicates.

Gotcha #3: Not handling rate limits
APIs have limits. Implement exponential backoff when you hit 429 errors.

Gotcha #4: Ignoring catch-all domains
Some domains accept all addresses. These return "unknown" status. Decide how to handle them (send with caution or skip).

Gotcha #5: Validating the same email twice
Deduplicate before validating to save API calls and money.
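
For Gotcha #3, exponential backoff can be sketched as follows. It assumes the wrapped call returns a dict shaped like validate_single's output, where {'error': 'rate_limit'} signals an HTTP 429. The name `with_backoff`, the delay values, and the 10% jitter are illustrative choices, not requirements of any API.

```python
import random
import time


def with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call `call()` until it stops reporting a rate limit or attempts run out."""
    for attempt in range(max_attempts):
        result = call()
        if result.get("error") != "rate_limit":
            return result
        if attempt < max_attempts - 1:
            # 1s, 2s, 4s, ... capped at max_delay, plus up to 10% random jitter
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    return result  # still rate limited after the final attempt
```

Usage would look like `with_backoff(lambda: validator.validate_single(email))`.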

Next Steps

You now have a production-ready email validator in Python. Some enhancements to consider:

  • Database integration: Store validation results in PostgreSQL or MySQL
  • Scheduling: Use cron or celery to validate lists automatically
  • Web interface: Build a Flask/Django front-end for non-technical users
  • Batch API: Some validation services offer bulk endpoints that are faster
  • Caching: Cache results for emails you've already validated
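
As one illustration of the caching idea, here is a minimal in-memory sketch. `CachedValidator` is a hypothetical wrapper, the 30-day lifetime is an arbitrary assumption, and a real deployment would more likely persist results in a database.

```python
import time


class CachedValidator:
    """Wrap a validate function with a cache keyed by the normalized email."""

    def __init__(self, validate, ttl_seconds=30 * 24 * 3600, clock=time.time):
        self._validate = validate
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # email -> (timestamp, result)

    def __call__(self, email):
        key = email.strip().lower()
        hit = self._cache.get(key)
        if hit is not None and self._clock() - hit[0] < self._ttl:
            return hit[1]
        result = self._validate(key)
        if "error" not in result:  # don't cache transient failures
            self._cache[key] = (self._clock(), result)
        return result
```

Wrapping the validator is then a one-liner: `cached = CachedValidator(validator.validate_single)`.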

The complete code from this tutorial is available as a GitHub Gist (link in comments).

Now go validate some emails!

Get Your API Key

Sign up for free and get 100 daily validations. No credit card required.
