Error Handling
✅ Module 1: Basics of Error Handling in Python
🔹 1.1 Common Types of Errors in Python
These errors commonly occur (note that `SyntaxError` is raised when the code is parsed; the rest occur at runtime):
| Error Type | Description | Example |
|---|---|---|
| `SyntaxError` | Invalid Python syntax | `if x = 5` (should be `==`) |
| `NameError` | Variable not defined | `print(x)` when `x` is undefined |
| `TypeError` | Operation on incompatible types | `"2" + 5` |
| `ValueError` | Function receives an inappropriate value | `int("hello")` |
| `ZeroDivisionError` | Division by zero | `5 / 0` |
| `IndexError` | Index out of range | `arr[10]` |
🔹 1.2 Basic try-except Block

```python
try:
    result = 10 / 0
except ZeroDivisionError:
    print("Cannot divide by zero!")
```

🧠 Explanation: This prevents the program from crashing by catching the `ZeroDivisionError`.
🔹 1.3 Catching Multiple Exceptions

```python
try:
    value = int("abc")
    print(10 / 0)
except ValueError:
    print("Conversion failed!")
except ZeroDivisionError:
    print("Division by zero!")
```

You can also combine them:

```python
try:
    # some risky code
    pass
except (ValueError, ZeroDivisionError) as e:
    print(f"Error occurred: {e}")
```
🔹 1.4 Using else and finally

```python
try:
    result = 10 / 2
except ZeroDivisionError:
    print("Division error!")
else:
    print("No error occurred:", result)
finally:
    print("This will always run.")
```
🔹 1.5 Raising Exceptions

You can manually raise errors:

```python
def divide(a, b):
    if b == 0:
        raise ValueError("b cannot be zero.")
    return a / b

print(divide(10, 2))
```
🔹 1.6 Creating Custom Exceptions

```python
class MyCustomError(Exception):
    pass

def check_value(x):
    if x < 0:
        raise MyCustomError("Negative values not allowed!")

try:
    check_value(-1)
except MyCustomError as e:
    print("Caught custom error:", e)
```
✅ Module 2: Advanced Exception Handling
This module focuses on writing more robust, scalable, and clean error-handling code — useful in larger apps or production environments.
🔹 2.1 Nested Try-Except Blocks

You can have try-except blocks inside other try blocks.

```python
try:
    a = int(input("Enter number: "))
    try:
        result = 10 / a
        print(result)
    except ZeroDivisionError:
        print("Inner: Division by zero")
except ValueError:
    print("Outer: Invalid input")
```

🧠 Why: Helps isolate and handle errors in specific code sections.
🔹 2.2 Catching Multiple Exceptions with as

```python
try:
    # risky code
    pass
except (TypeError, ValueError) as e:
    print(f"An error occurred: {e}")
```

🧠 Tip: `as e` gives access to the original exception object for logging or debugging.
🔹 2.3 Exception Chaining

Useful to raise a new exception while preserving the original one.

```python
try:
    1 / 0
except ZeroDivisionError as e:
    raise ValueError("Invalid math operation") from e
```

🧠 Why: Maintains the traceback of both exceptions — useful for debugging.
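The original exception stays attached to the new one via its `__cause__` attribute, which is how both tracebacks survive. A small sketch:

```python
def risky():
    # Re-raise a ZeroDivisionError as a ValueError, keeping the original
    try:
        1 / 0
    except ZeroDivisionError as e:
        raise ValueError("Invalid math operation") from e

try:
    risky()
except ValueError as e:
    # The chained exception is preserved on __cause__
    print(type(e.__cause__).__name__)  # ZeroDivisionError
```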
🔹 2.4 Logging Exceptions (Instead of Printing)

```python
import logging

logging.basicConfig(level=logging.ERROR)

try:
    1 / 0
except ZeroDivisionError:
    logging.error("Division error occurred", exc_info=True)
```

🧠 Why logging?

- `print()` is fine for small scripts.
- `logging` is ideal for production, debugging, and persistent logs.
🔹 2.5 Best Practices for Exception Handling

✅ Do:

- Handle specific exceptions (`ValueError`, not just `Exception`)
- Keep `try` blocks small
- Use logging instead of `print`
- Document custom exceptions
- Catch exceptions only when you can handle or report them

❌ Avoid:

- Catching broad `Exception` unless at the top level
- Swallowing errors silently
- Overusing nested try blocks
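The practices above can be combined in one small sketch (the function and messages are illustrative): a tiny `try` block, a specific exception, and logging rather than printing:

```python
import logging

logging.basicConfig(level=logging.ERROR)

def parse_quantity(raw):
    """Parse a quantity string, returning None on bad input."""
    try:
        return int(raw)  # keep the try block to the one risky call
    except ValueError:   # specific exception, not bare Exception
        logging.error("Could not parse quantity: %r", raw)
        return None

print(parse_quantity("42"))   # 42
print(parse_quantity("ten"))  # None
```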
✅ Module 3: Error Handling in File I/O and APIs
🔹 3.1 Handling File I/O Errors

When working with files, errors like missing files or permission issues are common.

```python
try:
    with open("data.csv", "r") as file:
        content = file.read()
except FileNotFoundError:
    print("The file does not exist.")
except PermissionError:
    print("Permission denied.")
except Exception as e:
    print(f"Unexpected error: {e}")
```

🧠 Best Practice: Use `with open(...)` to auto-close files.
🔹 3.2 Reading CSV/JSON Safely

```python
import csv

try:
    with open("data.csv", newline='') as f:
        reader = csv.reader(f)
        for row in reader:
            print(row)
except Exception as e:
    print("Error reading CSV:", e)
```

```python
import json

try:
    with open("data.json") as f:
        data = json.load(f)
except FileNotFoundError:
    print("data.json not found")
except json.JSONDecodeError:
    print("JSON is malformed")
```
🔹 3.3 Writing Files with Care

```python
try:
    with open("output.txt", "w") as f:
        f.write("Sample output")
except IOError as e:
    print("Write error:", e)
```

🧠 Always handle `IOError` (an alias of `OSError` since Python 3.3) when writing files (e.g., disk full, permission denied).
🔹 3.4 Error Handling with APIs using requests

```python
import requests

try:
    response = requests.get("https://api.example.com/data", timeout=5)
    response.raise_for_status()  # Raises HTTPError for bad responses
    data = response.json()
except requests.exceptions.HTTPError as errh:
    print("HTTP error:", errh)
except requests.exceptions.ConnectionError as errc:
    print("Connection error:", errc)
except requests.exceptions.Timeout as errt:
    print("Timeout error:", errt)
except requests.exceptions.RequestException as err:
    print("Something went wrong:", err)
```

🧠 Use `.raise_for_status()` to trigger exceptions for 4xx/5xx responses.
🔹 3.5 Custom Wrapper Function for Safe API Calls

```python
import requests

def safe_api_call(url):
    try:
        res = requests.get(url, timeout=5)
        res.raise_for_status()
        return res.json()
    except Exception as e:
        print(f"API failed: {e}")
        return None
```

This makes API handling reusable and robust.
✅ Module 4: Data Handling Errors in Pandas and NumPy
🔹 4.1 Common Errors in Pandas

| Error | Cause/Example |
|---|---|
| `KeyError` | Accessing a missing column/index |
| `IndexError` | Accessing an out-of-bound row index |
| `ValueError` | Mismatch during assignment or reshaping |
| `TypeError` | Operations on incompatible datatypes |
🔹 4.2 Handling Missing Values (NaN, None)

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "name": ["A", "B", np.nan],
    "score": [90, np.nan, 80]
})

# Detect missing values
print(df.isnull())

# Drop rows with missing values
df_clean = df.dropna()

# Fill missing values with a placeholder
df_filled = df.fillna("Unknown")
```

🧠 Tip: Always check for NaN before doing aggregations.
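For example, most pandas aggregations skip NaN silently, so an average can be computed over fewer rows than you expect. Using the `df` above:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "name": ["A", "B", np.nan],
    "score": [90, np.nan, 80]
})

# mean() skips NaN by default: this averages only 2 of the 3 rows
print(df["score"].mean())  # 85.0

# Make the gap explicit before trusting the aggregate
missing = df["score"].isnull().sum()
print(f"{missing} missing score value(s)")
```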
🔹 4.3 Safely Accessing Columns

```python
if "age" in df.columns:
    print(df["age"])
else:
    print("Column 'age' does not exist")
```

Use `.get()` for dictionaries and column-safe logic for DataFrames.
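Both patterns side by side; `DataFrame.get()` mirrors `dict.get()` by returning a default instead of raising `KeyError`:

```python
import pandas as pd

record = {"name": "A"}
df = pd.DataFrame({"name": ["A", "B"]})

# dict.get: returns the default instead of raising KeyError
print(record.get("age", "missing"))  # missing

# DataFrame.get: returns None (or a supplied default) for an absent column
print(df.get("age") is None)  # True
```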
🔹 4.4 NumPy Array Shape and Type Errors

```python
import numpy as np

arr = np.array([1, 2, 3])

try:
    arr.reshape(2, 2)
except ValueError as e:
    print("Reshape error:", e)
```

🧠 Always validate shape compatibility before reshaping or broadcasting.
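One simple validation: a reshape only works when the element count matches the target shape, so compare `arr.size` first (the helper name is illustrative):

```python
import numpy as np

def safe_reshape(arr, rows, cols):
    """Reshape only when the element counts match; otherwise return None."""
    if arr.size != rows * cols:
        print(f"Cannot reshape {arr.size} elements into {rows}x{cols}")
        return None
    return arr.reshape(rows, cols)

arr = np.array([1, 2, 3, 4])
print(safe_reshape(arr, 2, 2))  # 4 elements fit a 2x2 array
print(safe_reshape(arr, 3, 2))  # None: 4 elements cannot fill 3x2
```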
🔹 4.5 Type Conversion and Casting Errors

```python
try:
    df["score"] = df["score"].astype(int)
except ValueError as e:
    print("Type casting failed:", e)
```

Use `pd.to_numeric(df["col"], errors="coerce")` to handle bad conversions gracefully.
🔹 4.6 Try-Except Around Data Transformations

```python
def safe_transform(df):
    try:
        df["score"] = df["score"].fillna(0).astype(int)
        return df
    except Exception as e:
        print("Data transformation error:", e)
        return df
```
🔹 4.7 Chained Indexing Warnings (Best Practice)

```python
# Not recommended — may lead to SettingWithCopyWarning
df[df['name'] == 'A']['score'] = 95

# Recommended
df.loc[df['name'] == 'A', 'score'] = 95
```

🧠 Use `.loc[]` or `.iloc[]` for assignment to avoid ambiguous behavior.
🔹 4.8 Debugging Unexpected Results

Use `print(df.dtypes)` and `df.head()` before and after operations to trace silent failures (like unexpected NaNs).
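A sketch of that workflow, using a hypothetical `score` column: inspecting dtypes before and after a conversion exposes values that were silently coerced to NaN:

```python
import pandas as pd

df = pd.DataFrame({"score": ["90", "85", "n/a"]})
print(df.dtypes)   # score is object (strings), not numeric yet

df["score"] = pd.to_numeric(df["score"], errors="coerce")
print(df.dtypes)   # score is now float64
print(df.head())   # "n/a" silently became NaN, visible here

# Count how many values the coercion dropped
print(df["score"].isnull().sum())  # 1
```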
🧪 Practice Tasks

- Load a CSV with some missing data and apply `.fillna()` safely.
- Try reshaping a NumPy array incorrectly and handle the error.
- Write a function that converts a column to numeric and handles type casting failures using `to_numeric(..., errors="coerce")`.
- Attempt to access a missing DataFrame column and handle it gracefully.
✅ Module 5: Error Handling in Data Cleaning Pipelines
🔹 5.1 Parsing Errors in Data (Dates, Numbers, etc.)

```python
import pandas as pd

data = pd.DataFrame({
    "date": ["2025-01-01", "not_a_date", "2024-12-31"]
})

# Handle with errors='coerce'
data["parsed_date"] = pd.to_datetime(data["date"], errors="coerce")
```

🧠 Tip: Use `errors="coerce"` in `to_datetime` and `to_numeric` to avoid hard crashes; invalid values become `NaT`/`NaN` instead of raising.
🔹 5.2 Safe apply() Functions

Custom transformations inside `apply()` can break if data is dirty, so use try-except inside them.

```python
def safe_parse(x):
    try:
        return int(x)
    except (TypeError, ValueError):
        return None

df["converted"] = df["raw_col"].apply(safe_parse)
```
🔹 5.3 Skipping or Logging Bad Records

While looping over records (e.g., row-wise ops), handle bad rows with logging:

```python
import logging

logging.basicConfig(filename="bad_rows.log", level=logging.ERROR)

def process_row(row):
    try:
        # risky transformation
        return row["a"] / row["b"]
    except Exception as e:
        logging.error(f"Row failed: {row.to_dict()} | Error: {e}")
        return None

df["result"] = df.apply(process_row, axis=1)
```
🔹 5.4 Handling Duplicates with Grace

```python
try:
    df = df.drop_duplicates()
except Exception as e:
    print("Duplicate removal failed:", e)
```
🔹 5.5 Cleaning Pipelines with Function Wrappers

You can wrap multiple steps in a pipeline-like cleaning function with full safety:

```python
def clean_data(df):
    try:
        df["price"] = pd.to_numeric(df["price"], errors="coerce")
        df["date"] = pd.to_datetime(df["date"], errors="coerce")
        df.dropna(subset=["price", "date"], inplace=True)
        return df
    except Exception as e:
        print("Cleaning pipeline failed:", e)
        return df
```
🔹 5.6 Error-Handled ETL Mini-Pipeline

```python
def load_and_clean(path):
    try:
        df = pd.read_csv(path)
    except FileNotFoundError:
        print("File not found")
        return None
    try:
        df = clean_data(df)
    except Exception as e:
        print("Cleaning failed:", e)
    return df
```

🧠 This is the kind of pattern you’ll use in production ETL jobs and notebooks.
✅ Module 6: Error Handling in Machine Learning Pipelines
🔹 6.1 Handling Train-Test Split Issues

```python
from sklearn.model_selection import train_test_split

try:
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y)
except ValueError as e:
    print("Train-test split failed:", e)
```

🧠 Common issues:

- Mismatch in `X` and `y` lengths
- Using `stratify` when some target classes have too few samples
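Both failure modes can be caught up front with cheap checks before calling `train_test_split`; this standalone helper (name and threshold are illustrative) needs no scikit-learn to run:

```python
from collections import Counter

def check_split_inputs(X, y, min_class_count=2):
    """Return a problem description, or None if the inputs look splittable."""
    if len(X) != len(y):
        return f"Length mismatch: {len(X)} samples vs {len(y)} labels"
    rarest = min(Counter(y).values())
    if rarest < min_class_count:
        return f"Rarest class has only {rarest} sample(s); stratify may fail"
    return None

print(check_split_inputs([[1], [2], [3]], [0, 1]))             # length mismatch
print(check_split_inputs([[1], [2], [3]], [0, 0, 1]))          # rare class
print(check_split_inputs([[1], [2], [3], [4]], [0, 0, 1, 1]))  # None
```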
🔹 6.2 Catching Errors in Model Training

```python
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()

try:
    model.fit(X_train, y_train)
except ValueError as e:
    print("Model training error:", e)
```

🧠 Always check:

- Input types (pandas vs NumPy)
- Missing values
- Feature shapes
🔹 6.3 Handling Scikit-learn Pipeline Failures

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', RandomForestClassifier())
])

try:
    pipeline.fit(X_train, y_train)
except Exception as e:
    print("Pipeline training failed:", e)
```

🧠 Watch for:

- Mismatched input types
- Missing values not handled before `StandardScaler`
🔹 6.4 Catching Hyperparameter Search Errors

```python
from sklearn.model_selection import GridSearchCV

param_grid = {"n_estimators": [10, 50, 100]}
grid = GridSearchCV(RandomForestClassifier(), param_grid, cv=5)

try:
    grid.fit(X_train, y_train)
except Exception as e:
    print("GridSearchCV failed:", e)
```

🧠 Common mistakes:

- Wrong parameter names
- Empty grids
- Invalid scoring functions
🔹 6.5 Handling Warnings like ConvergenceWarning

```python
import warnings
from sklearn.exceptions import ConvergenceWarning

with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    model.fit(X_train, y_train)
```

🧠 You can also choose to log warnings or elevate them to errors in CI systems.
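To elevate, flip the filter action from "ignore" to "error": the warning is then raised as an exception and fails the run loudly. A sketch using a plain `UserWarning` so it runs without scikit-learn:

```python
import warnings

def noisy_step():
    # Stand-in for a model fit that emits a warning
    warnings.warn("model did not converge", UserWarning)

with warnings.catch_warnings():
    warnings.simplefilter("error", UserWarning)  # escalate to an exception
    try:
        noisy_step()
    except UserWarning as w:
        print(f"CI build would fail here: {w}")
```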
🔹 6.6 Saving and Loading Models with Care

```python
import joblib

# Saving
try:
    joblib.dump(model, "model.pkl")
except Exception as e:
    print("Model save failed:", e)

# Loading
try:
    model = joblib.load("model.pkl")
except FileNotFoundError:
    print("Model file not found")
```
🔹 6.7 Wrap It in a Reusable Train Function

```python
def train_model(X_train, y_train):
    try:
        model = RandomForestClassifier()
        model.fit(X_train, y_train)
        return model
    except Exception as e:
        print("Training failed:", e)
        return None
```
✅ Module 7: Debugging and Logging Techniques
🔹 7.1 Why Logging > Printing

Problems with `print()`:

- Output disappears unless you're watching the console.
- It doesn't work well in production or background jobs.

Logging advantages:

- Persistent (can save to files)
- Different severity levels
- Timestamped messages
- Better traceability
🔹 7.2 Basic Logging Setup

```python
import logging

logging.basicConfig(level=logging.INFO)

logging.info("Pipeline started")
logging.warning("This might be a problem")
logging.error("Something went wrong")
```

You can write logs to a file:

```python
logging.basicConfig(filename='app.log', level=logging.DEBUG,
                    format='%(asctime)s - %(levelname)s - %(message)s')
```

🧠 Levels: DEBUG < INFO < WARNING < ERROR < CRITICAL
🔹 7.3 Logging Exceptions

```python
try:
    1 / 0
except ZeroDivisionError:
    logging.exception("Division failed")
```

🧠 `logging.exception` automatically includes the full traceback.
🔹 7.4 Setting Up Module-Specific Logs

```python
logger = logging.getLogger("DataCleaner")
logger.setLevel(logging.DEBUG)
logger.debug("Starting cleaning process")
```

Use different loggers for different pipeline stages.
🔹 7.5 Debugging with pdb (Python Debugger)

Start an interactive debugging session:

```python
import pdb

def buggy_function():
    x = 10
    y = 0
    pdb.set_trace()
    print(x / y)

buggy_function()
```

🧠 Inside pdb, use commands:

- `n`: next line
- `c`: continue
- `p var`: print variable
- `q`: quit
🔹 7.6 Using traceback for Custom Logs

```python
import traceback

try:
    1 / 0
except Exception:
    print(traceback.format_exc())
```

🧠 Useful when you want custom error formatting instead of a full crash.
🔹 7.7 Good Logging Practices

✅ Do:

- Use logging in place of `print` in production
- Log meaningful messages, not just "Error occurred"
- Keep debug logs in development, warn/error logs in production

❌ Avoid:

- Logging sensitive information (API keys, passwords)
- Logging inside tight loops (it may slow things down)
🔹 7.8 Logging in Jupyter Notebooks

```python
import logging

logging.basicConfig(level=logging.INFO, force=True)
logging.info("Notebook log works!")
```

🧠 Use `force=True` (Python 3.8+) to reset any existing configuration inside notebooks.
✅ Module 8: Testing for Failures in Python & Data Science Pipelines
Testing isn’t just about checking if your code works — it’s about verifying it fails gracefully when something goes wrong.
🔹 8.1 Why Test for Failures?

- Ensures robustness
- Prevents silent bugs in production
- Helps future-proof your code

You’ll use either `unittest` (built-in) or `pytest` (popular in data teams).
🔹 8.2 Basic Test with unittest

```python
import unittest

def divide(a, b):
    return a / b

class TestMathOps(unittest.TestCase):
    def test_divide_success(self):
        self.assertEqual(divide(10, 2), 5)

    def test_divide_by_zero(self):
        with self.assertRaises(ZeroDivisionError):
            divide(10, 0)

if __name__ == '__main__':
    unittest.main()
```
🔹 8.3 Using pytest (Simpler, Recommended)

Install it:

```shell
pip install pytest
```

```python
# test_math.py
import pytest

def divide(a, b):
    return a / b

def test_divide_success():
    assert divide(10, 2) == 5

def test_divide_by_zero():
    with pytest.raises(ZeroDivisionError):
        divide(10, 0)
```

Run with:

```shell
pytest test_math.py
```
🔹 8.4 Test Error Handling in Data Functions

```python
def convert_to_int(value):
    try:
        return int(value)
    except ValueError:
        return None

def test_convert_valid():
    assert convert_to_int("123") == 123

def test_convert_invalid():
    assert convert_to_int("abc") is None
```
🔹 8.5 Mocking Error Scenarios with unittest.mock

```python
from unittest.mock import patch
import pytest
import requests

def fetch_data(url):
    return requests.get(url).json()

@patch('requests.get')
def test_fetch_error(mock_get):
    mock_get.side_effect = requests.exceptions.RequestException
    with pytest.raises(requests.exceptions.RequestException):
        fetch_data("https://api.fake.com")
```

🧠 Use mocking to simulate:

- API failures
- File not found
- Model loading errors
🔹 8.6 Testing Cleaning Functions with Edge Cases

```python
def clean_age(x):
    try:
        age = int(x)
        if age < 0 or age > 120:
            return None
        return age
    except (TypeError, ValueError):
        return None

def test_clean_age_valid():
    assert clean_age("25") == 25

def test_clean_age_negative():
    assert clean_age("-5") is None

def test_clean_age_string():
    assert clean_age("abc") is None
```
🔹 8.7 Tips for Failure-Oriented Testing

✅ Do:

- Write tests for expected failure cases
- Use `pytest.raises()` or `unittest`'s `assertRaises`
- Test edge cases and dirty data

❌ Don’t:

- Only test happy paths
- Swallow errors silently