Error Handling

✅ Module 1: Basics of Error Handling in Python

🔹 1.1 Common Types of Errors in Python

These errors commonly occur. Note that SyntaxError is raised when the code is parsed, before the program runs; the rest are raised at runtime:

Error Type | Description | Example
SyntaxError | Invalid Python syntax | if x = 5 (should be ==)
NameError | Variable not defined | print(x) when x is undefined
TypeError | Operation on incompatible types | "2" + 5
ValueError | Function receives an inappropriate value | int("hello")
ZeroDivisionError | Division by zero | 5 / 0
IndexError | Index out of range | arr[10]
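A short sketch that deliberately triggers several of these and catches each one (SyntaxError is excluded, since it is raised before the code runs):

```python
# Deliberately trigger common runtime errors and report each one.
examples = [
    lambda: int("hello"),       # ValueError
    lambda: "2" + 5,            # TypeError
    lambda: 5 / 0,              # ZeroDivisionError
    lambda: [1, 2, 3][10],      # IndexError
]

for fn in examples:
    try:
        fn()
    except (ValueError, TypeError, ZeroDivisionError, IndexError) as e:
        print(type(e).__name__, "->", e)
```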

🔹 1.2 Basic try-except Block

try:
    result = 10 / 0
except ZeroDivisionError:
    print("Cannot divide by zero!")

🧠 Explanation: This prevents the program from crashing by catching the ZeroDivisionError.


🔹 1.3 Catching Multiple Exceptions

try:
    value = int("abc")
    print(10 / 0)
except ValueError:
    print("Conversion failed!")
except ZeroDivisionError:
    print("Division by zero!")

You can also combine them:

try:
    # some risky code
    pass
except (ValueError, ZeroDivisionError) as e:
    print(f"Error occurred: {e}")

🔹 1.4 Using else and finally

try:
    result = 10 / 2
except ZeroDivisionError:
    print("Division error!")
else:
    print("No error occurred:", result)
finally:
    print("This will always run.")

🔹 1.5 Raising Exceptions

You can manually raise errors:

def divide(a, b):
    if b == 0:
        raise ValueError("b cannot be zero.")
    return a / b

print(divide(10, 2))

🔹 1.6 Creating Custom Exceptions

class MyCustomError(Exception):
    pass

def check_value(x):
    if x < 0:
        raise MyCustomError("Negative values not allowed!")

try:
    check_value(-1)
except MyCustomError as e:
    print("Caught:", e)


✅ Module 2: Advanced Exception Handling

This module focuses on writing more robust, scalable, and clean error-handling code — useful in larger apps or production environments.


🔹 2.1 Nested Try-Except Blocks

You can have try-except blocks inside other try blocks.

try:
    a = int(input("Enter number: "))
    try:
        result = 10 / a
        print(result)
    except ZeroDivisionError:
        print("Inner: Division by zero")
except ValueError:
    print("Outer: Invalid input")

🧠 Why: Helps isolate and handle errors in specific code sections.


🔹 2.2 Catching Multiple Exceptions with as

try:
    # risky code
    pass
except (TypeError, ValueError) as e:
    print(f"An error occurred: {e}")

🧠 Tip: as e gives access to the original exception object for logging or debugging.


🔹 2.3 Exception Chaining

Useful to raise a new exception while preserving the original one.

try:
    1 / 0
except ZeroDivisionError as e:
    raise ValueError("Invalid math operation") from e

🧠 Why: Maintains traceback of both exceptions — useful for debugging.
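The chained original remains available on the new exception's __cause__ attribute (set by raise ... from ...), as this small sketch shows:

```python
# Chain an exception, then inspect the preserved original via __cause__.
try:
    try:
        1 / 0
    except ZeroDivisionError as e:
        raise ValueError("Invalid math operation") from e
except ValueError as err:
    print("Caught:", err)
    print("Original cause:", repr(err.__cause__))  # the ZeroDivisionError
```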


🔹 2.4 Logging Exceptions (Instead of Printing)

import logging

logging.basicConfig(level=logging.ERROR)

try:
    1 / 0
except ZeroDivisionError:
    logging.error("Division error occurred", exc_info=True)

🧠 Why logging?

  • print() is fine for small scripts.

  • logging is ideal for production, debugging, and persistent logs.


🔹 2.5 Best Practices for Exception Handling

✅ Do:

  • Handle specific exceptions (ValueError, not just Exception)

  • Keep try blocks small

  • Use logging instead of print

  • Document custom exceptions

  • Catch exceptions only when you can handle or report them

❌ Avoid:

  • Catching broad Exception unless at the top level

  • Swallowing errors silently

  • Overusing nested try blocks
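A sketch of these guidelines in one place: the specific, expected exception is handled close to the call and logged, while a broad Exception catch appears only at the (hypothetical) top-level entry point:

```python
import logging

logging.basicConfig(level=logging.ERROR)

def parse_score(text):
    # Handle the specific, expected failure close to where it occurs.
    try:
        return int(text)
    except ValueError:
        logging.error("Bad score value: %r", text)
        return None

def main():
    return [parse_score(s) for s in ["90", "oops", "75"]]

# Broad catch only at the top level, and never silently.
try:
    print(main())  # [90, None, 75]
except Exception:
    logging.exception("Unhandled error at top level")
```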


✅ Module 3: Error Handling in File I/O and APIs


🔹 3.1 Handling File I/O Errors

When working with files, errors like missing files or permission issues are common.

try:
    with open("data.csv", "r") as file:
        content = file.read()
except FileNotFoundError:
    print("The file does not exist.")
except PermissionError:
    print("Permission denied.")
except Exception as e:
    print(f"Unexpected error: {e}")

🧠 Best Practice: Use with open(...) to auto-close files.


🔹 3.2 Reading CSV/JSON Safely

import csv

try:
    with open("data.csv", newline='') as f:
        reader = csv.reader(f)
        for row in reader:
            print(row)
except Exception as e:
    print("Error reading CSV:", e)

import json

try:
    with open("data.json") as f:
        data = json.load(f)
except json.JSONDecodeError:
    print("JSON is malformed")

🔹 3.3 Writing Files with Care

try:
    with open("output.txt", "w") as f:
        f.write("Sample output")
except IOError as e:
    print("Write error:", e)

🧠 Always handle IOError (an alias of OSError in Python 3) when writing files (e.g., disk full, permission denied).


🔹 3.4 Error Handling with APIs using requests

import requests

try:
    response = requests.get("https://api.example.com/data", timeout=5)
    response.raise_for_status()  # Raises HTTPError for bad responses
    data = response.json()
except requests.exceptions.HTTPError as errh:
    print("HTTP error:", errh)
except requests.exceptions.ConnectionError as errc:
    print("Connection error:", errc)
except requests.exceptions.Timeout as errt:
    print("Timeout error:", errt)
except requests.exceptions.RequestException as err:
    print("Something went wrong:", err)

🧠 Use .raise_for_status() to trigger exceptions for 4xx/5xx responses.


🔹 3.5 Custom Wrapper Function for Safe API Call

def safe_api_call(url):
    try:
        res = requests.get(url, timeout=5)
        res.raise_for_status()
        return res.json()
    except Exception as e:
        print(f"API failed: {e}")
        return None

This makes API handling reusable and robust.


✅ Module 4: Data Handling Errors in Pandas and NumPy


🔹 4.1 Common Errors in Pandas

Error | Cause/Example
KeyError | Accessing a missing column/index
IndexError | Accessing an out-of-bound row index
ValueError | Mismatch during assignment or reshaping
TypeError | Operations on incompatible datatypes

🔹 4.2 Handling Missing Values (NaN, None)

import pandas as pd
import numpy as np

df = pd.DataFrame({
    "name": ["A", "B", np.nan],
    "score": [90, np.nan, 80]
})

# Detect missing
print(df.isnull())

# Drop rows with missing
df_clean = df.dropna()

# Fill missing with value
df_filled = df.fillna("Unknown")

🧠 Tip: Always check for NaN before doing aggregations.
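For instance, a quick check before aggregating (pandas skips NaN by default, but skipna=False shows how it propagates):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"score": [90, np.nan, 80]})

# Count missing values before aggregating.
n_missing = df["score"].isnull().sum()
print(f"{n_missing} missing score(s)")

print(df["score"].mean())              # 85.0 (NaN skipped by default)
print(df["score"].mean(skipna=False))  # nan (NaN propagates)
```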


🔹 4.3 Safely Accessing Columns

if "age" in df.columns:
    print(df["age"])
else:
    print("Column 'age' does not exist")

Use .get() for dictionaries and column-safe logic for DataFrames.
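The dictionary side of that tip looks like this: .get() returns a default instead of raising KeyError:

```python
record = {"name": "A", "score": 90}

# dict.get never raises KeyError; it returns the default instead.
print(record.get("score", 0))        # 90
print(record.get("age", "unknown"))  # unknown
```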


🔹 4.4 NumPy Array Shape and Type Errors

import numpy as np

arr = np.array([1, 2, 3])
try:
    arr.reshape(2, 2)
except ValueError as e:
    print("Reshape error:", e)

🧠 Always validate shape compatibility before reshaping or broadcasting.


🔹 4.5 Type Conversion and Casting Errors

try:
    df["score"] = df["score"].astype(int)
except ValueError as e:
    print("Type casting failed:", e)

Use pd.to_numeric(df["col"], errors="coerce") to handle bad conversions gracefully.
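A quick sketch of the coerce behaviour: invalid entries become NaN instead of raising:

```python
import pandas as pd

s = pd.Series(["1", "2", "oops"])

# errors="coerce" turns unparseable values into NaN rather than raising.
nums = pd.to_numeric(s, errors="coerce")
print(nums.tolist())  # [1.0, 2.0, nan]
```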


🔹 4.6 Try-Except Around Data Transformations

def safe_transform(df):
    try:
        df["score"] = df["score"].fillna(0).astype(int)
        return df
    except Exception as e:
        print("Data transformation error:", e)
        return df

🔹 4.7 Chained Indexing Warnings (Best Practice)

# Not recommended — may lead to SettingWithCopyWarning
df[df['name'] == 'A']['score'] = 95

# Recommended
df.loc[df['name'] == 'A', 'score'] = 95

🧠 Use .loc[] or .iloc[] for assignment to avoid ambiguous behavior.


🔹 4.8 Debugging Unexpected Results

Use print(df.dtypes) and df.head() before and after operations to trace silent failures (like NaNs).
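A small sketch of that habit: a column that looks numeric but isn't, caught by checking dtypes before and after conversion:

```python
import pandas as pd

df = pd.DataFrame({"score": ["90", "85", "bad"]})
print(df.dtypes)  # score is object, not a numeric dtype

df["score"] = pd.to_numeric(df["score"], errors="coerce")
print(df.dtypes)  # now float64
print(df.head())  # "bad" silently became NaN, visible here
```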


🧪 Practice Tasks

  1. Load a CSV with some missing data and apply .fillna() safely.

  2. Try reshaping a NumPy array incorrectly and handle the error.

  3. Write a function that converts a column to numeric and handles type casting failures using to_numeric(..., errors="coerce").

  4. Attempt to access a missing DataFrame column and handle it gracefully.




✅ Module 5: Error Handling in Data Cleaning Pipelines


🔹 5.1 Parsing Errors in Data (Dates, Numbers, etc.)

import pandas as pd

data = pd.DataFrame({
    "date": ["2025-01-01", "not_a_date", "2024-12-31"]
})

# Handle with errors='coerce'
data["parsed_date"] = pd.to_datetime(data["date"], errors="coerce")

🧠 Tip: Use errors="coerce" with to_datetime and to_numeric to avoid hard crashes on bad values.


🔹 5.2 Safe apply() Functions

Custom transformations inside apply() can break if data is dirty. Always use try-except inside them.

def safe_parse(x):
    try:
        return int(x)
    except (ValueError, TypeError):
        return None

df["converted"] = df["raw_col"].apply(safe_parse)

🔹 5.3 Skipping or Logging Bad Records

While looping over records (e.g., row-wise ops), handle bad rows with logging:

import logging

logging.basicConfig(filename="bad_rows.log", level=logging.ERROR)

def process_row(row):
    try:
        # risky transformation
        return row["a"] / row["b"]
    except Exception as e:
        logging.error(f"Row failed: {row.to_dict()} | Error: {e}")
        return None

df["result"] = df.apply(process_row, axis=1)

🔹 5.4 Handling Duplicates with Grace

try:
    df = df.drop_duplicates()
except TypeError as e:
    # drop_duplicates raises TypeError when cells hold unhashable values (e.g., lists)
    print("Duplicate removal failed:", e)

🔹 5.5 Cleaning Pipelines with Function Wrappers

You can wrap multiple steps in a pipeline-like cleaning function with full safety:

def clean_data(df):
    try:
        df["price"] = pd.to_numeric(df["price"], errors="coerce")
        df["date"] = pd.to_datetime(df["date"], errors="coerce")
        df.dropna(subset=["price", "date"], inplace=True)
        return df
    except Exception as e:
        print("Cleaning pipeline failed:", e)
        return df

🔹 5.6 Error-Handled ETL Mini-Pipeline

def load_and_clean(path):
    try:
        df = pd.read_csv(path)
    except FileNotFoundError:
        print("File not found")
        return None

    try:
        df = clean_data(df)
    except Exception as e:
        print("Cleaning failed:", e)
    
    return df

🧠 This is the kind of pattern you’ll use in production ETL jobs and notebooks.


✅ Module 6: Error Handling in Machine Learning Pipelines


🔹 6.1 Handling Train-Test Split Issues

from sklearn.model_selection import train_test_split

try:
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y)
except ValueError as e:
    print("Train-test split failed:", e)

🧠 Common issues:

  • Mismatch in X and y lengths

  • Using stratify on target with too few classes
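Both issues can be caught up front with cheap pre-flight checks before calling train_test_split (the data here is hypothetical):

```python
from collections import Counter

import numpy as np

# Hypothetical features and labels.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

# Pre-flight checks before train_test_split(..., stratify=y).
assert len(X) == len(y), "X and y length mismatch"
class_counts = Counter(y.tolist())
assert min(class_counts.values()) >= 2, "stratify needs at least 2 samples per class"
print("Inputs look OK for a stratified split")
```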


🔹 6.2 Catching Errors in Model Training

from sklearn.linear_model import LogisticRegression

model = LogisticRegression()

try:
    model.fit(X_train, y_train)
except ValueError as e:
    print("Model training error:", e)

🧠 Always check:

  • Input types (Pandas vs NumPy)

  • Missing values

  • Feature shapes


🔹 6.3 Handling Scikit-learn Pipeline Failures

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', RandomForestClassifier())
])

try:
    pipeline.fit(X_train, y_train)
except Exception as e:
    print("Pipeline training failed:", e)

🧠 Watch for:

  • Mismatched input types

  • Missing values not handled before StandardScaler


🔹 6.4 Catching Hyperparameter Search Errors

from sklearn.model_selection import GridSearchCV

param_grid = {"n_estimators": [10, 50, 100]}
grid = GridSearchCV(RandomForestClassifier(), param_grid, cv=5)

try:
    grid.fit(X_train, y_train)
except Exception as e:
    print("GridSearchCV failed:", e)

🧠 Common mistakes:

  • Wrong param names

  • Empty grids

  • Invalid scoring functions


🔹 6.5 Handling Warnings like ConvergenceWarning

import warnings
from sklearn.exceptions import ConvergenceWarning

with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    model.fit(X_train, y_train)

🧠 You can also choose to log or elevate warnings as errors in CI systems.
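Elevating a warning to an error is one simplefilter call. This sketch uses the built-in UserWarning so it runs standalone; in a real pipeline you would pass sklearn's ConvergenceWarning instead:

```python
import warnings

with warnings.catch_warnings():
    # "error" turns matching warnings into raised exceptions.
    warnings.simplefilter("error", UserWarning)
    try:
        warnings.warn("model did not converge", UserWarning)
    except UserWarning as e:
        print("Escalated to error:", e)
```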


🔹 6.6 Saving and Loading Models with Care

import joblib

# Saving
try:
    joblib.dump(model, "model.pkl")
except Exception as e:
    print("Model save failed:", e)

# Loading
try:
    model = joblib.load("model.pkl")
except FileNotFoundError:
    print("Model file not found")

🔹 6.7 Wrap it in a Reusable Train Function

def train_model(X_train, y_train):
    try:
        model = RandomForestClassifier()
        model.fit(X_train, y_train)
        return model
    except Exception as e:
        print("Training failed:", e)
        return None


✅ Module 7: Debugging and Logging Techniques


🔹 7.1 Why Logging > Printing

Problem with print():

  • It disappears unless you're watching the console.

  • Doesn't work well in production or background jobs.

Logging Advantages:

  • Persistent (can save to files)

  • Different severity levels

  • Timestamped messages

  • Better traceability


🔹 7.2 Basic Logging Setup

import logging

logging.basicConfig(level=logging.INFO)
logging.info("Pipeline started")
logging.warning("This might be a problem")
logging.error("Something went wrong")

You can write logs to a file:

logging.basicConfig(filename='app.log', level=logging.DEBUG,
                    format='%(asctime)s - %(levelname)s - %(message)s')

🧠 Levels: DEBUG < INFO < WARNING < ERROR < CRITICAL


🔹 7.3 Logging Exceptions

try:
    1 / 0
except ZeroDivisionError:
    logging.exception("Division failed")

🧠 logging.exception automatically includes the full traceback.


🔹 7.4 Setting Up Module-Specific Logs

logger = logging.getLogger("DataCleaner")
logger.setLevel(logging.DEBUG)

logger.debug("Starting cleaning process")

Use different loggers for different pipeline stages.
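A sketch with two hypothetical stage loggers, so each log line carries its stage name:

```python
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format="%(name)s - %(levelname)s - %(message)s",
)

# One named logger per (hypothetical) pipeline stage.
loader_log = logging.getLogger("Loader")
cleaner_log = logging.getLogger("DataCleaner")

loader_log.info("Reading raw file")       # Loader - INFO - Reading raw file
cleaner_log.debug("Dropping duplicates")  # DataCleaner - DEBUG - Dropping duplicates
```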


🔹 7.5 Debugging with pdb (Python Debugger)

Start interactive debugging session:

import pdb

def buggy_function():
    x = 10
    y = 0
    pdb.set_trace()
    print(x / y)

buggy_function()

🧠 Inside pdb, use commands:

  • n: next line

  • c: continue

  • p var: print variable

  • q: quit
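Since Python 3.7 the built-in breakpoint() does the same thing without an import, and can be disabled via the PYTHONBREAKPOINT environment variable:

```python
def buggy_function():
    x, y = 10, 0
    breakpoint()  # drops into pdb here by default; set PYTHONBREAKPOINT=0 to disable
    print(x / y)

# Calling buggy_function() pauses at the breakpoint in an interactive session.
```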


🔹 7.6 Using traceback for Custom Logs

import traceback

try:
    1 / 0
except Exception:
    print(traceback.format_exc())

🧠 Useful when you want custom error formatting instead of a full crash.


🔹 7.7 Good Logging Practices

✅ Do:

  • Use logging in place of print in production

  • Log meaningful messages, not just "Error occurred"

  • Keep debug logs in development, warn/error logs in production

❌ Avoid:

  • Logging sensitive information (API keys, passwords)

  • Logging inside tight loops (may slow down)


🔹 7.8 Logging in Jupyter Notebooks

import logging

logging.basicConfig(level=logging.INFO, force=True)
logging.info("Notebook log works!")

🧠 Use force=True to reset configuration inside notebooks.


✅ Module 8: Testing for Failures in Python & Data Science Pipelines

Testing isn’t just about checking if your code works — it’s about verifying it fails gracefully when something goes wrong.


🔹 8.1 Why Test for Failures?

  • Ensures robustness

  • Prevents silent bugs in production

  • Helps future-proof your code

You’ll use either unittest (built-in) or pytest (popular in data teams).


🔹 8.2 Basic Test with unittest

import unittest

def divide(a, b):
    return a / b

class TestMathOps(unittest.TestCase):
    def test_divide_success(self):
        self.assertEqual(divide(10, 2), 5)

    def test_divide_by_zero(self):
        with self.assertRaises(ZeroDivisionError):
            divide(10, 0)

if __name__ == '__main__':
    unittest.main()

🔹 8.3 Using pytest (Simpler, Recommended)

Install it:

pip install pytest

Then create a test file:

# test_math.py
import pytest

def divide(a, b):
    return a / b

def test_divide_success():
    assert divide(10, 2) == 5

def test_divide_by_zero():
    with pytest.raises(ZeroDivisionError):
        divide(10, 0)

Run with:

pytest test_math.py

🔹 8.4 Test Error Handling in Data Functions

def convert_to_int(value):
    try:
        return int(value)
    except ValueError:
        return None

def test_convert_valid():
    assert convert_to_int("123") == 123

def test_convert_invalid():
    assert convert_to_int("abc") is None

🔹 8.5 Mocking Error Scenarios with unittest.mock

from unittest.mock import patch
import requests

def fetch_data(url):
    return requests.get(url).json()

@patch('requests.get')
def test_fetch_error(mock_get):
    mock_get.side_effect = requests.exceptions.RequestException
    with pytest.raises(requests.exceptions.RequestException):
        fetch_data("https://api.fake.com")

🧠 Use mocking to simulate:

  • API failures

  • File not found

  • Model loading errors


🔹 8.6 Testing Cleaning Functions with Edge Cases

def clean_age(x):
    try:
        age = int(x)
        if age < 0 or age > 120:
            return None
        return age
    except (ValueError, TypeError):
        return None

def test_clean_age_valid():
    assert clean_age("25") == 25

def test_clean_age_negative():
    assert clean_age("-5") is None

def test_clean_age_string():
    assert clean_age("abc") is None

🔹 8.7 Tips for Failure-Oriented Testing

✅ Do:

  • Write tests for expected failure cases

  • Use pytest.raises() or unittest's assertRaises

  • Test edge cases and dirty data

❌ Don’t:

  • Only test happy paths

  • Swallow errors silently


