PYTHON KEY POINTS
📘 Python Technical Notes (Cheat Sheet)
1. Python Basics
1.1 Data Types & Conversions
- Check datatype: type(variable)
- Check length: len(variable)
- Case conversion: variable.isupper(), variable.islower(), variable.upper(), variable.lower()
- Type casting: int1 = 500; str(int1)  # "500"
- String formatting: print("----{}--------{}".format(val1, val2))  # braces must be empty — "{ }" raises KeyError
2. Python Data Structures
🔹 1️⃣ LISTS
Declaration:
list1 = [] # empty list
list2 = [1, 2, 3, 'a'] # heterogeneous
list3 = list((10, 20, 30)) # using list() constructor
Properties:
- ✅ Ordered
- ✅ Mutable (modifiable)
- ✅ Allows duplicates
- ✅ Supports indexing & slicing
🔸 Common Operations:
| Operation | Description | Example |
|---|---|---|
| append(x) | Add single element at end | list1.append(5) |
| extend(iterable) | Add multiple elements | list1.extend([6,7]) |
| insert(i, x) | Insert at index i | list1.insert(1, 99) |
| remove(x) | Remove first occurrence of x | list1.remove(3) |
| pop(i=-1) | Remove and return element at index i | list1.pop(2) |
| clear() | Remove all elements | list1.clear() |
| index(x) | Return index of first x | list1.index(99) |
| count(x) | Count occurrences | list1.count(2) |
| sort(reverse=False, key=None) | Sort list in place | list1.sort() |
| reverse() | Reverse list in place | list1.reverse() |
| copy() | Shallow copy | new_list = list1.copy() |
🔸 Indexing & Slicing:
a = [10, 20, 30, 40, 50]
a[0] # 10
a[-1] # 50
a[1:4] # [20, 30, 40]
a[::-1] # [50, 40, 30, 20, 10]
🔸 Other Useful Operations:
len(a)
sum(a)
max(a)
min(a)
sorted(a)
🔸 List Comprehension:
squares = [x**2 for x in range(5)]
🔸 Unpacking:
x, y, z = [1, 2, 3]
🔸 Nested Lists:
matrix = [[1,2,3], [4,5,6]]
matrix[1][2] # 6
🔹 2️⃣ TUPLES
Declaration:
tup1 = () # empty tuple
tup2 = (1, 2, 3)
tup3 = tuple([10, 20])
tup4 = (1,) # single-element tuple (note comma)
Properties:
- ✅ Ordered
- ❌ Immutable
- ✅ Allows duplicates
- ✅ Supports indexing & slicing
🔸 Common Operations:
| Method / Operation | Description | Example |
|---|---|---|
| count(x) | Count occurrences | tup.count(2) |
| index(x) | Return index of first x | tup.index(10) |
| len(tup) | Get length | len(tup) |
| max(tup) / min(tup) | Get largest/smallest | max(tup) |
| sum(tup) | Sum (for numeric) | sum(tup) |
| + | Concatenation | (1,2)+(3,4) |
| * | Repetition | (1,2)*3 |
| in | Membership | 2 in tup |
🔸 Tuple Unpacking:
a, b, c = (10, 20, 30)
🔸 Nested Tuples:
nested = ((1,2), (3,4))
nested[0][1] # 2
🔸 Conversion:
tuple([1, 2, 3])
list((4, 5, 6))
🔹 3️⃣ SETS
Declaration:
s1 = {1, 2, 3}
s2 = set([4, 5, 6])
empty_set = set() # {} creates dictionary, not set!
Properties:
- ❌ Unordered
- ❌ Unindexed
- ✅ Mutable (elements can be added/removed)
- ❌ No duplicates
🔸 Common Methods:
| Method | Description | Example |
|---|---|---|
| add(x) | Add single element | s1.add(4) |
| update(iterable) | Add multiple | s1.update([5,6]) |
| remove(x) | Remove element; raises KeyError if not found | s1.remove(2) |
| discard(x) | Remove if present (no error) | s1.discard(10) |
| pop() | Remove and return an arbitrary element | s1.pop() |
| clear() | Remove all | s1.clear() |
| copy() | Shallow copy | s2 = s1.copy() |
🔸 Set Operations:
| Operation | Example | Result |
|---|---|---|
| Union | s1 \| s2 or s1.union(s2) | All elements from both |
| Intersection | s1 & s2 or s1.intersection(s2) | Common elements |
| Difference | s1 - s2 | Elements in s1 not in s2 |
| Symmetric Difference | s1 ^ s2 | Elements in either set, but not both |
| Subset | s1.issubset(s2) | True/False |
| Superset | s1.issuperset(s2) | True/False |
| Disjoint | s1.isdisjoint(s2) | True/False |
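A quick runnable sketch of the operations above:

```python
s1 = {1, 2, 3, 4}
s2 = {3, 4, 5, 6}

print(s1 | s2)              # union → {1, 2, 3, 4, 5, 6}
print(s1 & s2)              # intersection → {3, 4}
print(s1 - s2)              # difference → {1, 2}
print(s1 ^ s2)              # symmetric difference → {1, 2, 5, 6}
print({1, 2}.issubset(s1))  # True
```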
🔸 Frozenset:
Immutable version of set.
fs = frozenset([1,2,3])
🔹 4️⃣ DICTIONARIES
Declaration:
dict1 = {} # empty
dict2 = {'a':1, 'b':2}
dict3 = dict(name="John", age=25)
dict4 = dict([('x',10), ('y',20)])
Properties:
- ✅ Key–Value pairs
- ✅ Mutable
- ❌ No duplicate keys (last value retained)
- ✅ Keys must be immutable (e.g., str, int, tuple)
🔸 Access & Update:
d = {'a':1, 'b':2}
d['a'] # 1
d.get('b') # 2
d['c'] = 3 # add new
d['a'] = 10 # update
🔸 Common Methods:
| Method | Description | Example |
|---|---|---|
| keys() | Returns all keys | d.keys() |
| values() | Returns all values | d.values() |
| items() | Returns key-value pairs | d.items() |
| get(key, default) | Safe access | d.get('x', 0) |
| pop(key, default) | Remove key and return value | d.pop('a') |
| popitem() | Removes last inserted pair | d.popitem() |
| update(dict2) | Merge another dictionary | d.update({'x': 9}) |
| clear() | Remove all items | d.clear() |
| copy() | Shallow copy | d2 = d.copy() |
| setdefault(key, default) | If key missing, add with default | d.setdefault('z', 5) |
🔸 Iteration:
for k in d.keys(): print(k)
for v in d.values(): print(v)
for k,v in d.items(): print(k, v)
🔸 Dictionary Comprehension:
squares = {x: x**2 for x in range(5)}
🔸 Nested Dictionaries:
students = {
'John': {'age': 25, 'grade': 'A'},
'Sara': {'age': 22, 'grade': 'B'}
}
print(students['John']['grade']) # A
🔹 SUMMARY TABLE
| Data Structure | Ordered | Mutable | Allows Duplicates | Syntax Example |
|---|---|---|---|---|
| List | ✅ | ✅ | ✅ | [1,2,3] |
| Tuple | ✅ | ❌ | ✅ | (1,2,3) |
| Set | ❌ | ✅ | ❌ | {1,2,3} |
| Dictionary | ✅ (from Py3.7+) | ✅ | ❌ (keys) | {'a':1, 'b':2} |
3. Comprehensions
- List: [x for x in variable if condition]
- Dictionary: {x: x+2 for x in variable if condition}
- Set: {x for x in variable if condition}
4. Functions & Errors
4.1 Function Structure
def func_name(args):
try:
...
except Exception:
...
else:
...
finally:
...
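A runnable sketch of this flow (the function name and messages are illustrative):

```python
def safe_divide(a, b):
    try:
        result = a / b
    except ZeroDivisionError:          # runs only if the try block raised
        result = None
    else:                              # runs only if no exception occurred
        print("division succeeded")
    finally:                           # always runs, exception or not
        print("cleanup happens here")
    return result

print(safe_divide(10, 2))  # 5.0
print(safe_divide(1, 0))   # None
```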
4.2 Error Types
- Syntax Errors
- Logical / Semantic Errors
- Runtime Errors (Exceptions)
4.3 Common Exceptions
- IndexError
- ModuleNotFoundError
- KeyError
- ImportError
- TypeError
- ValueError
- NameError
- ZeroDivisionError
5. Lambda & Functional Programming
- Conditional lambda: f = lambda x, y: x if x > y else y
- Map: list(map(lambda x: x**2, [1,2,3]))
- Filter: list(filter(lambda x: x%2==0, [1,2,3,4]))
- Reduce: from functools import reduce; reduce(lambda x, y: x+y, [1,2,3])
6. NumPy Basics
6.1 Array Creation
import numpy as np
x = np.array([...])
x.ndim
np.arange(1,10,2)
np.zeros((3,3))
np.random.randint(1,10, size=(2,3))
6.2 Indexing & Slicing
arr[2]
arr[1:4]
6.3 Operations
np.sum(arr, axis=0)
np.mod(arr, 2)
np.remainder(arr, 3)
np.divide(arr, 2)
np.multiply(arr1, arr2)
np.matmul(arr1, arr2)
np.add(arr1, arr2)
np.subtract(arr1, arr2)
np.max(arr)
np.min(arr)
6.4 Shape & Transform
np.arange(9).reshape(3,3)
np.array_split(arr, 3)
np.hsplit(arr, 2)
np.concatenate([arr1, arr2])
np.repeat(arr, 2)
7. Pandas Basics
7.1 Series
import pandas as pd
s = pd.Series(data=[1,2,3], index=['a','b','c'])
s.describe()
s.count()
s.sum()
s.mean()
s.median()
s.mode()
s.apply(lambda x: x*2)
7.2 DataFrame
df = pd.DataFrame({...})
pd.read_csv("file.csv")
pd.read_excel("file.xlsx")
- Access: df.loc[row_label], df.iloc[row_num], df.columns, df.dtypes
- Iteration: df.iterrows(), df.itertuples(), df.items()
- Descriptive: df.describe(), df.shape, df.info(), df.head(), df.tail()
7.3 Cleaning & Transformation
df.isnull().sum()
df.drop_duplicates(inplace=True)
df.replace(val1, val2, inplace=True)
df.groupby('col')['value'].mean()
df.sort_values(by='col')
df.reset_index()
7.4 Pivot & Aggregation
pd.pivot_table(df, index=['col1'], columns=['col2'], values=['val'])
df.agg({'col': ['min', 'max', 'mean']})
7.5 Export
df.to_csv("out.csv")
df.to_json("out.json")
df.to_html("out.html")
df.to_pickle("out.pkl")
8. Statistics in Python
- Five-point summary: min, Q1, Q2, Q3, max
- IQR = Q3 – Q1
- Range = max – min
- Variance: x.var()
- Std Dev: x.std()
- Covariance: x.cov()
- Correlation: x.corr()
- Skewness: x.skew()
9. Visualization
Univariate
- Histogram: plt.hist(x, bins=10)
- Distribution plot: sns.histplot(x, kde=True)  (sns.distplot is deprecated)
- Violinplot: sns.violinplot(x=x)
Multivariate
- Pairplot: sns.pairplot(df)
- Scatterplot: sns.scatterplot(x=x, y=y)
- Heatmap: sns.heatmap(df.corr(), annot=True)
- Pie chart: plt.pie(values)
- Boxplot: plt.boxplot(x)
- Countplot: sns.countplot(x=x)
- Stripplot / Swarmplot: sns.stripplot(x=x), sns.swarmplot(x=x)
10. Data Preprocessing
10.1 Handling Missing Values
- Remove unwanted chars
- Impute with mean/median/KNN
- Log transformation for skewness
10.2 Outlier Handling
- IQR method
- Log transformation
- Capping
10.3 Encoding
- Label Encoding:
  from sklearn.preprocessing import LabelEncoder
  le = LabelEncoder()
  df['col'] = le.fit_transform(df['col'])
- OneHot Encoding:
  from sklearn.preprocessing import OneHotEncoder
  enc = OneHotEncoder()
  enc.fit_transform(df[['col']])
10.4 Scaling
- StandardScaler
- MinMaxScaler
- Log / Exponential transformation
🧠1️⃣ Statsmodels Cheat Sheet (for Statistical Modeling & Inference)
🔹 Importing
🔹 Core Purpose
Used for statistical modeling, hypothesis testing, and inference (unlike sklearn which focuses on prediction).
🔹 Regression Models
👉 Linear Regression (OLS)
👉 Formula API (like R-style)
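A sketch of the formula API (column names are illustrative):

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({"y": [1.0, 2.1, 2.9, 4.2], "x": [1, 2, 3, 4]})
model = smf.ols("y ~ x", data=df).fit()   # R-style formula: response ~ predictors
print(model.params)                       # Intercept and slope
```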
🔹 Logistic Regression
🔹 Time Series
🔹 Hypothesis Testing
Other tests:
- ANOVA → sm.stats.anova_lm(model, typ=2)
- Durbin-Watson (autocorrelation) → sm.stats.durbin_watson(model.resid)
- Jarque-Bera (normality) → sm.stats.jarque_bera(model.resid)
- Breusch-Pagan (heteroscedasticity) → sm.stats.diagnostic.het_breuschpagan(...)
🔹 Diagnostic Plots
🔹 Key Strengths
✅ Detailed statistical summary
✅ Hypothesis testing & p-values
✅ Time series & econometrics support
✅ Formula-based modeling (like R)
⚙️ 2️⃣ Scikit-Learn (sklearn) Cheat Sheet
🔹 Import Core
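The import block under this heading was lost; the commonly used core imports are:

```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import accuracy_score, r2_score, mean_squared_error
```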
🔹 Workflow
1️⃣ Split Data
2️⃣ Scale Data
3️⃣ Train Model
4️⃣ Predict
5️⃣ Evaluate
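The five steps above can be sketched end to end on hypothetical data:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

np.random.seed(42)
X = np.random.rand(100, 3)                                   # hypothetical features
y = X @ np.array([1.5, -2.0, 0.5]) + np.random.normal(0, 0.05, 100)

# 1) Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# 2) Scale (fit on train only, then transform both)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# 3) Train
model = LinearRegression().fit(X_train, y_train)
# 4) Predict
y_pred = model.predict(X_test)
# 5) Evaluate
r2 = r2_score(y_test, y_pred)
print(r2)
```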
🔹 Key Modules
| Task | Common Classes |
|---|---|
| Preprocessing | StandardScaler, LabelEncoder, OneHotEncoder, MinMaxScaler |
| Regression | LinearRegression, Ridge, Lasso, ElasticNet, SVR, RandomForestRegressor |
| Classification | LogisticRegression, KNeighborsClassifier, SVC, RandomForestClassifier, XGBClassifier (from xgboost, sklearn-compatible) |
| Clustering | KMeans, DBSCAN, AgglomerativeClustering |
| Dimensionality Reduction | PCA, TruncatedSVD, TSNE |
| Model Selection | GridSearchCV, RandomizedSearchCV, cross_val_score |
| Metrics | r2_score, mean_squared_error, accuracy_score, precision_score, recall_score, f1_score, roc_auc_score |
🔹 Pipelines
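The pipeline example was lost; a minimal sketch chaining scaling and a classifier (toy data is illustrative):

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

pipe = Pipeline([
    ("scale", StandardScaler()),      # step 1: preprocessing
    ("clf", LogisticRegression()),    # step 2: estimator
])

np.random.seed(0)
X = np.random.rand(40, 2)
y = (X[:, 0] > 0.5).astype(int)

pipe.fit(X, y)                 # data flows through every step in order
preds = pipe.predict(X[:5])
print(preds)
```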
🔹 Save/Load Model
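The snippet here was dropped; the usual approach is joblib (filename is illustrative):

```python
import numpy as np
import joblib
from sklearn.linear_model import LinearRegression

model = LinearRegression().fit(np.array([[0.], [1.], [2.]]), np.array([0., 1., 2.]))

joblib.dump(model, "model.joblib")        # serialize to disk
loaded = joblib.load("model.joblib")      # restore
print(loaded.predict([[3.]]))             # ≈ [3.]
```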
🔹 Key Strengths
✅ Predictive modeling
✅ Feature engineering tools
✅ Pipelines and GridSearchCV
✅ Works seamlessly with numpy/pandas
🤖 3️⃣ Keras Cheat Sheet (for Deep Learning)
🔹 Import
🔹 Sequential Model
🔹 Compile
🔹 Fit
🔹 Evaluate & Predict
🔹 Save/Load Model
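The code under the headings above was lost in conversion; one end-to-end sketch covering import → Sequential → compile → fit → evaluate/predict → save/load (data and layer sizes are illustrative):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical binary-classification data
np.random.seed(0)
X = np.random.rand(200, 4).astype("float32")
y = (X[:, 0] > 0.5).astype("float32")

# Sequential model
model = keras.Sequential([
    keras.Input(shape=(4,)),
    layers.Dense(16, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(1, activation="sigmoid"),
])

# Compile
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Fit
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2, verbose=0)

# Evaluate & predict
loss, acc = model.evaluate(X, y, verbose=0)
preds = model.predict(X[:3], verbose=0)

# Save / load
model.save("model.keras")
restored = keras.models.load_model("model.keras")
```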
🔹 Common Layers
| Layer | Description |
|---|---|
Dense | Fully connected layer |
Dropout | Prevents overfitting |
Flatten | Converts 2D → 1D |
LSTM, GRU | Sequence modeling |
Conv2D, MaxPooling2D | CNN layers for image tasks |
BatchNormalization | Normalizes activations |
🔹 Activation Functions
| Name | Usage |
|---|---|
relu | Hidden layers |
sigmoid | Binary classification |
softmax | Multi-class output |
tanh | LSTM layers |
🔹 Optimizers
| Optimizer | Notes |
|---|---|
adam | Most commonly used, adaptive |
sgd | Vanilla stochastic gradient |
rmsprop | Recurrent models (LSTM/GRU) |
🔹 Loss Functions
| Problem | Loss |
|---|---|
| Regression | mse, mae |
| Binary Classification | binary_crossentropy |
| Multi-class Classification | categorical_crossentropy |
🔹 Visualize Training
🔹 Key Strengths
✅ Easy model building
✅ Works seamlessly with TensorFlow
✅ Fast prototyping
✅ Supports CNNs, RNNs, Transformers
🚀 Summary Table
| Library | Focus | Strength |
|---|---|---|
| Statsmodels | Statistical analysis, inference | p-values, confidence intervals |
| Sklearn | Predictive ML | Preprocessing, pipelines, metrics |
| Keras | Deep Learning | Neural networks, easy to build |
⚡ TensorFlow Cheat Sheet
🧩 1️⃣ Core Concepts
🔹 Import TensorFlow
TensorFlow is an end-to-end framework for building, training, and deploying machine learning models — from linear regression → deep neural networks → production deployment.
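The import snippet was dropped here; the standard form is:

```python
import tensorflow as tf

print(tf.__version__)
print(tf.config.list_physical_devices("GPU"))   # empty list if no GPU is visible
```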
🧠2️⃣ Tensors (Core Data Structure)
🔹 Creating Tensors
🔹 Tensor Operations
🔹 Converting Between NumPy and Tensor
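The examples under the three tensor headings above were lost; one combined sketch:

```python
import tensorflow as tf

# Creating tensors
a = tf.constant([[1., 2.], [3., 4.]])
b = tf.ones((2, 2))

# Tensor operations
print(tf.add(a, b))
print(tf.matmul(a, b))
print(tf.reduce_sum(a))        # 10.0

# Converting between NumPy and Tensor
arr = a.numpy()                # Tensor → ndarray
t = tf.convert_to_tensor(arr)  # ndarray → Tensor
```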
⚙️ 3️⃣ Basic Workflow (Model Pipeline)
1️⃣ Prepare Data →
2️⃣ Build Model →
3️⃣ Compile →
4️⃣ Train →
5️⃣ Evaluate →
6️⃣ Predict / Save
🧰 4️⃣ Dataset Handling
🔹 TensorFlow Dataset API
Use Case: Efficient data pipelines for large datasets.
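The pipeline code was lost; a minimal `tf.data` sketch (the toy data is illustrative):

```python
import tensorflow as tf

ds = tf.data.Dataset.from_tensor_slices(list(range(10)))
ds = ds.shuffle(10).batch(4).prefetch(tf.data.AUTOTUNE)   # shuffle → batch → prefetch

for batch in ds:
    print(batch.numpy())
```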
🧱 5️⃣ Building Neural Networks
🔹 Sequential Model
⚗️ 6️⃣ Compiling the Model
🚀 7️⃣ Training the Model
📈 8️⃣ Evaluate & Predict
💾 9️⃣ Save / Load Model
📊 🔟 Visualization
🧮 11️⃣ Common Layers in TensorFlow / Keras
| Layer | Purpose |
|---|---|
Dense() | Fully connected layer |
Dropout() | Prevents overfitting |
Flatten() | Convert 2D → 1D |
Conv2D() | Convolutional layer for images |
MaxPooling2D() | Downsampling |
LSTM(), GRU() | Sequence / time-series |
BatchNormalization() | Stabilizes training |
🧠12️⃣ Common Activation Functions
| Function | Use Case |
|---|---|
relu | Hidden layers |
sigmoid | Binary output |
softmax | Multi-class |
tanh | Recurrent networks |
🔧 13️⃣ Optimizers
| Optimizer | Description |
|---|---|
adam | Adaptive, default choice |
sgd | Simple gradient descent |
rmsprop | Good for RNNs |
adagrad | Sparse data optimization |
🧩 14️⃣ Loss Functions
| Task | Loss Function |
|---|---|
| Regression | mse, mae |
| Binary Classification | binary_crossentropy |
| Multi-class Classification | categorical_crossentropy |
| Custom | Define using tf.keras.losses.Loss |
🧰 15️⃣ Callbacks (During Training)
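The callback example was dropped; a sketch of the two most common ones (filenames are illustrative):

```python
from tensorflow import keras

# Stop early when validation loss stalls; keep the best weights on disk
callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=3, restore_best_weights=True),
    keras.callbacks.ModelCheckpoint("best.keras", save_best_only=True),
]
# Usage: model.fit(X, y, validation_split=0.2, callbacks=callbacks)
```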
🧮 16️⃣ TensorFlow Math & Gradient Operations
🔹 Auto-Differentiation
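The `tf.GradientTape` example was lost; a minimal sketch differentiating a simple function:

```python
import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 2 + 2 * x          # y = x² + 2x
grad = tape.gradient(y, x)      # dy/dx = 2x + 2 = 8 at x = 3
print(grad.numpy())
```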
🧰 17️⃣ Transfer Learning (Example)
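The example here was lost; a sketch of the frozen-base pattern. Note `weights=None` keeps the sketch offline — in real transfer learning you would pass `weights="imagenet"` to download pretrained weights:

```python
from tensorflow import keras

# Frozen convolutional base + new classification head
base = keras.applications.MobileNetV2(input_shape=(96, 96, 3), include_top=False, weights=None)
base.trainable = False   # freeze the base so only the new head trains

model = keras.Sequential([
    base,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(1, activation="sigmoid"),
])
```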
☁️ 18️⃣ TensorFlow Extended / Serving / Lite
| Module | Purpose |
|---|---|
| TFX | Production ML pipelines |
| TF Lite | Mobile / Edge deployment |
| TF Serving | Model serving via API |
| TF Hub | Pretrained model repository |
🧩 19️⃣ TensorFlow Useful Shortcuts
🚀 20️⃣ TensorFlow vs Keras Quick Difference
| Aspect | TensorFlow | Keras |
|---|---|---|
| Level | Low-level, flexible | High-level, easy |
| Use Case | Custom training loops, fine control | Quick prototyping |
| Integration | Keras now part of TensorFlow (tf.keras) | Unified since TF 2.x |
✅ 21️⃣ Common Interview-Ready Points
- TensorFlow uses computational graphs to execute efficiently on CPUs, GPUs, and TPUs.
- Supports eager execution (operations run immediately, like ordinary Python).
- The Keras API is now the official high-level interface of TensorFlow.
- TensorFlow datasets (tf.data) allow parallelized + streamed data feeding.
- You can write custom loss functions, custom layers, and custom training loops using tf.GradientTape.
🔥 PyTorch Cheat Sheet (For Data Science & Deep Learning)
🧩 1️⃣ Import & Setup
🔹 Check Device (CPU / GPU)
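The setup snippet was dropped; the standard device check is:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)
```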
🧮 2️⃣ Tensors (Core Data Structure)
🔹 Creating Tensors
🔹 Tensor Operations
🔹 Converting Between NumPy and Torch
🔹 GPU Usage
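The examples under the tensor headings above were lost; one combined sketch:

```python
import torch

# Creating tensors
a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.ones(2, 2)

# Tensor operations
print(a + b)
print(a @ b)              # matrix multiplication
print(a.sum(), a.mean())

# Converting between NumPy and Torch
arr = a.numpy()           # shares memory with `a`
t = torch.from_numpy(arr)

# GPU usage (no-op on CPU-only machines)
if torch.cuda.is_available():
    a = a.to("cuda")
```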
⚙️ 3️⃣ Gradient & Autograd (Automatic Differentiation)
🔹 Enable Gradient Tracking
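The autograd example was lost; a minimal sketch:

```python
import torch

x = torch.tensor(3.0, requires_grad=True)   # track operations on x
y = x ** 2 + 2 * x
y.backward()                                # compute dy/dx
print(x.grad)                               # 2x + 2 = 8 at x = 3
```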
🧠4️⃣ Building Neural Networks
🔹 Using nn.Module
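The model definition was dropped; a minimal `nn.Module` sketch (layer sizes are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 16)   # input → hidden
        self.fc2 = nn.Linear(16, 1)   # hidden → output

    def forward(self, x):
        x = F.relu(self.fc1(x))
        return torch.sigmoid(self.fc2(x))

net = Net()
out = net(torch.rand(5, 4))
print(out.shape)   # torch.Size([5, 1])
```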
⚗️ 5️⃣ Instantiate Model & Move to GPU
🧮 6️⃣ Loss & Optimizer
🚀 7️⃣ Training Loop
📈 8️⃣ Evaluation
💾 9️⃣ Save / Load Model
🧰 🔟 Datasets & Dataloaders
🔹 TensorDataset + DataLoader
🔹 Iterate
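The two snippets above can be sketched together (toy tensors are illustrative):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

X = torch.rand(100, 4)
y = torch.randint(0, 2, (100,))

ds = TensorDataset(X, y)                             # wraps tensors as (x, y) pairs
loader = DataLoader(ds, batch_size=32, shuffle=True)

# Iterate
for xb, yb in loader:
    print(xb.shape, yb.shape)
    break   # first batch only
```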
🧱 11️⃣ Common Layers
| Layer | Description |
|---|---|
nn.Linear() | Fully connected layer |
nn.Conv2d() | 2D convolution (images) |
nn.MaxPool2d() | Downsampling |
nn.ReLU() | Activation |
nn.Dropout() | Regularization |
nn.BatchNorm2d() | Normalize activations |
nn.LSTM() | Sequence model (RNN) |
nn.Embedding() | Word embeddings |
⚡ 12️⃣ Common Activation Functions
| Function | Use |
|---|---|
F.relu(x) | Hidden layers |
torch.sigmoid(x) | Binary output |
F.softmax(x, dim=1) | Multi-class |
torch.tanh(x) | RNNs |
🧮 13️⃣ Loss Functions
| Task | Loss Function |
|---|---|
| Regression | nn.MSELoss() |
| Binary Classification | nn.BCELoss() or nn.BCEWithLogitsLoss() |
| Multi-class | nn.CrossEntropyLoss() |
⚙️ 14️⃣ Optimizers
| Optimizer | Import |
|---|---|
| SGD | optim.SGD(model.parameters(), lr=0.01) |
| Adam | optim.Adam(model.parameters(), lr=0.001) |
| RMSprop | optim.RMSprop(model.parameters(), lr=0.001) |
📊 15️⃣ Learning Rate Scheduler
🧩 16️⃣ TorchVision (for Images)
🔹 Import
🔹 Transformations
🔹 Pretrained Model (Transfer Learning)
🧠17️⃣ Custom Dataset Class
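The class definition was dropped; a minimal sketch — a custom dataset must implement `__len__` and `__getitem__`:

```python
import torch
from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self, X, y):
        self.X = X
        self.y = y

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        return self.X[idx], self.y[idx]

ds = MyDataset(torch.rand(10, 3), torch.arange(10))
print(len(ds))
print(ds[0])     # (features, label) tuple
```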
🧮 18️⃣ GPU Acceleration Example
📈 19️⃣ Plot Training History
You can record losses manually in a list:
🧠20️⃣ Interview-Level Key Points
✅ PyTorch uses dynamic computation graphs (Define-by-Run) — built during execution (vs TensorFlow static graph).
✅ torch.autograd handles automatic differentiation.
✅ torch.nn is the neural network module.
✅ torch.utils.data simplifies data loading.
✅ torchvision & torchaudio handle vision/audio datasets.
✅ PyTorch is preferred for research, experimentation, and flexibility.
✅ Models can easily be exported to ONNX for deployment.
🧩 21️⃣ Comparing PyTorch vs TensorFlow
| Feature | PyTorch | TensorFlow |
|---|---|---|
| Graph Type | Dynamic (define-by-run) | Static (define-then-run) |
| Ease of Debugging | Easier | Slightly complex |
| Community | Research | Production |
| High-Level API | torch.nn, Lightning | tf.keras |
| Deployment | TorchServe / ONNX | TF Serving / Lite |
🧠22️⃣ Useful Utilities