TensorFlow, PyTorch, Flask, FastAPI, MongoDB, Agentic AI, Recommendation engines, Kubernetes, Spark, Grafana
TensorFlow – Beginner-Friendly Complete Notes
1. Introduction to TensorFlow
What is TensorFlow?
- Open-source machine learning framework by Google.
- Used for deep learning, ML, and numerical computation.
- Works on CPU, GPU, TPU.
Key Features:
- Easy model building (Keras high-level API).
- Runs on multiple devices.
- Large ecosystem (TensorBoard, TFLite, TF Serving).
2. Installation
pip install tensorflow
Check version:
import tensorflow as tf
print(tf.__version__)
3. Basic Building Blocks
a) Tensors
- Tensors = multi-dimensional arrays (like NumPy but GPU-friendly).
x = tf.constant([[1, 2], [3, 4]])
print(x)  # 2D tensor
- Tensor Ranks: Scalar (0D), Vector (1D), Matrix (2D), higher dimensions.
b) Variables
- Trainable tensors, used to store weights.
w = tf.Variable([0.5, 1.0])
c) Operations
- Math ops on tensors.
a = tf.constant([1, 2, 3])
b = tf.constant([4, 5, 6])
print(tf.add(a, b))  # [5 7 9]
4. TensorFlow vs NumPy
- NumPy: CPU only, no automatic differentiation.
- TensorFlow: works on GPU, supports automatic differentiation.
import numpy as np
np_arr = np.array([1, 2, 3])
tf_tensor = tf.convert_to_tensor(np_arr)
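The automatic-differentiation difference can be seen directly with `tf.GradientTape`, TensorFlow's gradient-recording API (a minimal sketch; the value 3.0 is arbitrary):

```python
import tensorflow as tf

# Record operations on a variable, then differentiate.
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 2          # y = x^2

grad = tape.gradient(y, x)  # dy/dx = 2x = 6.0
print(grad.numpy())
```

NumPy has no equivalent of this: gradients there must be derived and coded by hand.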
5. TensorFlow Workflow (Step by Step)
Step 1: Import Data
- Built-in datasets in tf.keras.datasets.
from tensorflow.keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Step 2: Preprocess Data
- Normalize and reshape.
x_train = x_train / 255.0
x_test = x_test / 255.0
Step 3: Build Model
- Use the Sequential API (a simple stack of layers).
from tensorflow.keras import models, layers
model = models.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])
Step 4: Compile Model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
Step 5: Train Model
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
Step 6: Evaluate
model.evaluate(x_test, y_test)
6. TensorFlow Model APIs
a) Sequential API
- Linear stack of layers.
- Best for simple models.
b) Functional API
- More flexible (multi-input/output, non-linear graphs).
inputs = layers.Input(shape=(28, 28))
x = layers.Flatten()(inputs)
x = layers.Dense(64, activation='relu')(x)
outputs = layers.Dense(10, activation='softmax')(x)
model = models.Model(inputs, outputs)
c) Subclassing API
- Full control using Python classes.
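A minimal subclassing sketch, assuming the same MNIST-style 28x28 input used elsewhere in these notes (the layer sizes are illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers

class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.flatten = layers.Flatten()
        self.dense1 = layers.Dense(64, activation='relu')
        self.dense2 = layers.Dense(10, activation='softmax')

    def call(self, x):
        # The forward pass is written explicitly, step by step.
        x = self.flatten(x)
        x = self.dense1(x)
        return self.dense2(x)

model = MyModel()
```

Because `call` is plain Python, you can add branching, loops, or custom logic that Sequential and Functional models cannot easily express.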
7. Common Layers
- Dense – fully connected.
- Conv2D, MaxPooling2D – for images.
- LSTM, GRU – for sequences/text.
- Dropout – prevents overfitting.
- BatchNormalization – normalizes activations.
8. Training Essentials
Optimizers
- SGD – simple gradient descent.
- Adam – most common, adaptive learning rate.
Loss Functions
- Regression → mse
- Classification → binary_crossentropy, categorical_crossentropy
Metrics
- Accuracy, Precision, Recall, F1.
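These choices map directly onto `model.compile`. A sketch using explicit Keras objects instead of string shortcuts, which lets you tune hyperparameters such as the learning rate (the layer sizes and input shape here are arbitrary):

```python
from tensorflow.keras import layers, models, optimizers, losses

# Tiny illustrative model: 4 input features, 3 output classes.
model = models.Sequential([
    layers.Input(shape=(4,)),
    layers.Dense(16, activation='relu'),
    layers.Dense(3, activation='softmax'),
])

# Explicit optimizer/loss objects instead of 'adam' / loss-name strings.
model.compile(
    optimizer=optimizers.Adam(learning_rate=1e-3),
    loss=losses.SparseCategoricalCrossentropy(),
    metrics=['accuracy'])
```

The string form ('adam') uses default hyperparameters; the object form is needed as soon as you want to change them.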
9. Callbacks
- Add functionality during training.
from tensorflow.keras.callbacks import EarlyStopping
cb = EarlyStopping(patience=3, restore_best_weights=True)
model.fit(x_train, y_train, epochs=20, callbacks=[cb])
- Common callbacks: EarlyStopping, ModelCheckpoint, TensorBoard.
10. Saving and Loading Models
# Save
model.save("my_model.h5")
# Load
from tensorflow.keras.models import load_model
model = load_model("my_model.h5")
11. TensorBoard (Visualization)
- Tool to visualize training (loss, accuracy, graphs).
tensorboard --logdir=logs/
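To produce the logs that command reads, attach the Keras `TensorBoard` callback during training (a minimal sketch; the `logs/` directory name matches the command above):

```python
from tensorflow.keras.callbacks import TensorBoard

# Writes training metrics under logs/ so `tensorboard --logdir=logs/`
# can visualize them.
tb_cb = TensorBoard(log_dir="logs/")

# Then pass it to fit(), e.g.:
# model.fit(x_train, y_train, epochs=5, callbacks=[tb_cb])
```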
12. TensorFlow Ecosystem
- TensorFlow Lite (TFLite) → for mobile/IoT.
- TensorFlow.js → for running ML in the browser.
- TF Serving → for deployment.
- TF Hub → pre-trained models.
13. Example: End-to-End Classification
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import fashion_mnist
# Load data
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
x_train, x_test = x_train/255.0, x_test/255.0
# Build model
model = models.Sequential([
    layers.Flatten(input_shape=(28,28)),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.3),
    layers.Dense(10, activation='softmax')
])
# Compile
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# Train
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
# Evaluate
print(model.evaluate(x_test, y_test))
14. Tips for Beginners
- Start with the Sequential API before the Functional API.
- Use callbacks to avoid overfitting.
- Always normalize data.
- Experiment with different optimizers and learning rates.
- Visualize results with TensorBoard.
15. Interview Quick Recap
- TensorFlow = ML framework by Google.
- Tensors = multidimensional arrays.
- APIs: Sequential, Functional, Subclassing.
- Common optimizers: SGD, Adam.
- Loss: MSE (regression), CrossEntropy (classification).
- Ecosystem: TF Lite, TF.js, TF Hub, TensorBoard.
PyTorch – Beginner-Friendly Complete Notes
1. Introduction to PyTorch
What is PyTorch?
- Open-source deep learning framework by Facebook (Meta).
- Flexible, Pythonic, widely used in research.
- Supports CPU & GPU.
Key Features:
- Dynamic computation graph (eager execution).
- Strong community for research + production.
- Integration with NumPy and Python libraries.
2. Installation
pip install torch torchvision torchaudio
Check version:
import torch
print(torch.__version__)
3. Core Building Blocks
a) Tensors
- Like NumPy arrays, but with GPU acceleration.
import torch
x = torch.tensor([[1, 2], [3, 4]])
print(x)
- Check device:
print(x.device)  # cpu by default
- Move to GPU (if available):
if torch.cuda.is_available():
    x = x.to("cuda")
b) Operations
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])
print(a + b)  # tensor([5, 7, 9])
c) Autograd (Automatic Differentiation)
- PyTorch tracks gradients for optimization.
w = torch.tensor(2.0, requires_grad=True)
y = w**2
y.backward()
print(w.grad)  # dy/dw = 2w = 4
4. PyTorch vs TensorFlow
- PyTorch: dynamic graph (easy debugging, flexible).
- TensorFlow: static + eager (optimized for deployment).
- PyTorch is favored for research; TensorFlow more for production.
5. Workflow in PyTorch
Step 1: Dataset
- Torch has dataset utilities in torchvision.datasets.
from torchvision import datasets, transforms
transform = transforms.ToTensor()
train_data = datasets.MNIST(root="data", train=True, transform=transform, download=True)
Step 2: DataLoader
- Helps with batching & shuffling data.
from torch.utils.data import DataLoader
train_loader = DataLoader(train_data, batch_size=64, shuffle=True)
Step 3: Define Model
- Use nn.Module (the base class for all models).
import torch.nn as nn
import torch.nn.functional as F

class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(28*28, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = x.view(-1, 28*28)  # flatten
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
Step 4: Define Loss and Optimizer
model = SimpleNN()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
Step 5: Training Loop
for epoch in range(5):
    for images, labels in train_loader:
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)
        # Backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}")
Step 6: Evaluation
correct, total = 0, 0
with torch.no_grad():  # no gradient calculation
    for images, labels in train_loader:
        outputs = model(images)
        _, preds = torch.max(outputs, 1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)
print("Accuracy:", correct / total)
6. PyTorch Components
a) Tensors
- torch.tensor(), torch.zeros(), torch.ones(), torch.rand().
b) Autograd
- requires_grad=True tracks gradients.
- tensor.backward() computes gradients.
c) Optimizers
- torch.optim.SGD, torch.optim.Adam, torch.optim.RMSprop.
d) Loss Functions
- nn.MSELoss() → regression.
- nn.CrossEntropyLoss() → classification.
e) Modules & Layers
- nn.Linear → fully connected.
- nn.Conv2d, nn.MaxPool2d → CNN layers.
- nn.LSTM, nn.GRU → RNN layers.
- nn.Dropout, nn.BatchNorm2d.
7. GPU/Device Management
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
Move data:
images, labels = images.to(device), labels.to(device)
8. Saving and Loading Models
# Save
torch.save(model.state_dict(), "model.pth")
# Load
model = SimpleNN()
model.load_state_dict(torch.load("model.pth"))
model.eval()
9. PyTorch Ecosystem
- torchvision → computer vision datasets & models.
- torchaudio → audio ML.
- torchtext → NLP datasets & models.
- PyTorch Lightning → high-level training framework.
- ONNX → export models for deployment.
10. Example: End-to-End Classification
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
# Data
transform = transforms.ToTensor()
train_data = datasets.FashionMNIST(root="data", train=True, transform=transform, download=True)
train_loader = DataLoader(train_data, batch_size=64, shuffle=True)
# Model
class FashionNN(nn.Module):
    def __init__(self):
        super(FashionNN, self).__init__()
        self.fc1 = nn.Linear(28*28, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = x.view(-1, 28*28)
        x = F.relu(self.fc1(x))
        return self.fc2(x)

model = FashionNN()
# Loss + Optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# Training
for epoch in range(3):
    for images, labels in train_loader:
        outputs = model(images)
        loss = criterion(outputs, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch+1}, Loss={loss.item():.4f}")
11. Tips for Beginners
- Always check the .shape of tensors.
- Remember to flatten images before feeding them into fully connected layers.
- Use with torch.no_grad() during evaluation.
- Use .to(device) to move model/data to GPU.
- Start with simple models, then try CNNs/RNNs.
12. Interview Quick Recap
- PyTorch = deep learning framework by Meta.
- Tensors = multi-dimensional arrays with GPU support.
- Autograd = automatic differentiation.
- Models → subclass nn.Module.
- Optimizers → SGD, Adam.
- Loss → MSE (regression), CrossEntropy (classification).
- Ecosystem → torchvision, torchaudio, torchtext, PyTorch Lightning.
MongoDB Complete Notes (Beginner Friendly)
1. What is MongoDB?
- MongoDB is an open-source NoSQL database that stores data in JSON-like documents (called BSON = Binary JSON).
- Unlike relational databases (SQL), MongoDB does not require predefined schemas.
- It's highly scalable, flexible, and works well for modern data apps, APIs, and ML pipelines.
2. Key Features
✅ NoSQL – document-oriented
✅ Schema-less – flexible data model
✅ High performance – fast read/write
✅ Scalable – supports sharding & replication
✅ JSON-style documents
✅ Powerful query language
✅ Integration with Python, Flask, Django, etc.
3. Basic Concepts
| Term | Description | SQL Equivalent |
|---|---|---|
| Database | Group of collections | Database |
| Collection | Group of documents | Table |
| Document | JSON-like data record | Row |
| Field | Key-value pair in document | Column |
| _id | Unique identifier for each document | Primary key |
4. MongoDB Architecture Overview
+-------------------------------------------------+
| MongoDB |
|-------------------------------------------------|
| Database → Collection → Document (JSON format) |
| Example: |
| db.users.insertOne({name: "Sanjay", age: 28}) |
+-------------------------------------------------+
5. Installation
Option 1: Local Setup
- Download from https://www.mongodb.com/try/download/community
- Start the MongoDB service:
mongod
- Open the Mongo shell:
mongosh
Option 2: MongoDB Atlas (Cloud)
- Create cluster → Connect → Copy the connection string, e.g.:
mongodb+srv://username:password@cluster.mongodb.net/test
6. MongoDB Data Format Example
{
"_id": 1,
"name": "John",
"age": 30,
"skills": ["Python", "ML", "Flask"],
"address": { "city": "Bangalore", "pincode": 560001 }
}
✅ Nested JSON
✅ Arrays supported
✅ Flexible structure
7. Basic MongoDB Commands
| Command | Description |
|---|---|
| show dbs | List all databases |
| use mydb | Switch/create database |
| show collections | List collections |
| db.createCollection("users") | Create a collection |
| db.users.insertOne({...}) | Insert single document |
| db.users.insertMany([...]) | Insert multiple documents |
| db.users.find() | View all documents |
| db.users.findOne() | View first document |
| db.users.updateOne() | Update single record |
| db.users.deleteOne() | Delete single record |
| db.dropDatabase() | Delete database |
8. CRUD Operations
➤ Create
db.students.insertOne({
  name: "Amit",
  age: 22,
  course: "Data Science"
})
➤ Read
db.students.find()
db.students.find({age: {$gt: 20}})
db.students.find({course: "Data Science"}, {name: 1, _id: 0})
➤ Update
db.students.updateOne(
  { name: "Amit" },
  { $set: { age: 23 } }
)
➤ Delete
db.students.deleteOne({ name: "Amit" })
9. Query Operators
| Operator | Meaning | Example |
|---|---|---|
| $gt | Greater than | {age: {$gt: 25}} |
| $lt | Less than | {age: {$lt: 25}} |
| $eq | Equal | {age: {$eq: 30}} |
| $ne | Not equal | {age: {$ne: 25}} |
| $in | In list | {city: {$in: ["Delhi","Mumbai"]}} |
| $and | Logical AND | {$and:[{age:{$gt:20}},{city:"Pune"}]} |
| $or | Logical OR | {$or:[{age:{$lt:20}},{city:"Delhi"}]} |
10. Indexing
Used to speed up queries.
db.users.createIndex({name: 1})
db.users.getIndexes()
11. Aggregation Framework
Aggregation = data processing pipelines (like SQL GROUP BY).
Example:
db.sales.aggregate([
{ $match: { region: "Asia" } },
{ $group: { _id: "$country", totalSales: { $sum: "$amount" } } },
{ $sort: { totalSales: -1 } }
])
12. Connection with Python (pymongo)
Install:
pip install pymongo
Connect:
from pymongo import MongoClient
client = MongoClient("mongodb://localhost:27017/")
db = client["mydb"]
collection = db["students"]
Insert:
collection.insert_one({"name": "Riya", "age": 21})
Fetch:
for s in collection.find():
    print(s)
Query:
result = collection.find({"age": {"$gt": 20}})
for r in result:
    print(r)
Update:
collection.update_one({"name": "Riya"}, {"$set": {"age": 22}})
Delete:
collection.delete_one({"name": "Riya"})
13. MongoDB with Flask (Example)
from flask import Flask, request, jsonify
from pymongo import MongoClient
app = Flask(__name__)
client = MongoClient("mongodb://localhost:27017/")
db = client["mydb"]
collection = db["users"]
@app.route("/add", methods=["POST"])
def add_user():
data = request.json
collection.insert_one(data)
return jsonify({"message": "User added successfully"})
@app.route("/users", methods=["GET"])
def get_users():
users = list(collection.find({}, {"_id": 0}))
return jsonify(users)
if __name__ == "__main__":
app.run(debug=True)
Access:
- POST /add with a JSON body
- GET /users to fetch users
14. Data Modeling Best Practices
✅ Use embedded documents for one-to-few relationships
✅ Use references for one-to-many relationships
✅ Keep document size < 16MB
✅ Use indexes on frequently queried fields
✅ Avoid unnecessary nesting
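The embedded-vs-referenced distinction can be illustrated with two hypothetical document shapes (plain sketches, not tied to any collection above):

```python
# One-to-few: embed the addresses directly inside the user document.
user_embedded = {
    "_id": 1,
    "name": "Amit",
    "addresses": [
        {"city": "Pune", "pincode": 411001},
        {"city": "Delhi", "pincode": 110001},
    ],
}

# One-to-many: store orders in their own collection and reference
# the user by _id, like a foreign key in SQL.
user = {"_id": 1, "name": "Amit"}
orders = [
    {"_id": 101, "user_id": 1, "amount": 500},
    {"_id": 102, "user_id": 1, "amount": 900},
]
```

Embedding keeps reads to a single document; references keep the parent document small when the related list can grow without bound.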
15. Replication & Sharding (Advanced Concepts)
| Concept | Description |
|---|---|
| Replication | Copies data across multiple servers for high availability |
| Primary Node | Receives all writes |
| Secondary Node | Copies data from primary |
| Sharding | Splits large data into horizontal partitions (scaling) |
16. MongoDB vs SQL Summary
| SQL | MongoDB |
|---|---|
| Structured schema | Schema-less |
| Tables, Rows | Collections, Documents |
| Joins | Embedded/Nested documents |
| SQL Queries | BSON + MongoDB Query Language |
| Vertical scaling | Horizontal scaling |
| Slower for large data | Faster for unstructured data |
17. MongoDB Atlas Example (Cloud)
from pymongo import MongoClient
client = MongoClient("mongodb+srv://<username>:<password>@cluster0.mongodb.net/")
db = client["sales_db"]
sales = db["transactions"]
sales.insert_one({"region": "Asia", "amount": 2000})
for doc in sales.find():
    print(doc)
18. Common Commands Recap
| Operation | Command |
|---|---|
| Create DB | use mydb |
| Create Collection | db.createCollection("users") |
| Insert | db.users.insertOne({name:"Amit"}) |
| Read | db.users.find() |
| Update | db.users.updateOne() |
| Delete | db.users.deleteOne() |
| Drop Collection | db.users.drop() |
19. Integrations
MongoDB integrates with:
- Flask, Django, FastAPI
- Pandas (via pymongo or mongoengine)
- ML pipelines (store model metadata)
- Airflow, Streamlit, etc.
20. Quick Summary
| Topic | Key Point |
|---|---|
| Type | NoSQL (Document-based) |
| Format | JSON-like BSON |
| Query Language | Mongo Query Language (MQL) |
| Key Libraries | pymongo, mongoengine |
| Best Use Case | Dynamic data, APIs, logs, ML storage |
| Cloud Option | MongoDB Atlas |
Model Context Protocol (MCP) – In Machine Learning
A Model Context Protocol refers to the way information, metadata, or inputs are structured and passed to a machine learning model so that:
- The model understands the input properly.
- The output can be interpreted or used consistently.
Think of it as the rules of communication between your model and its environment (data pipeline, serving system, or API).
1. Why Do We Need a Model Context Protocol?
- Models don't work in isolation; they need:
  - Input format (what data, how structured).
  - Context (user info, history, environment).
  - Output format (what the model returns, how it's consumed).
- MCP ensures standardization → makes the model reusable, debuggable, and deployable.
2. What Does It Include?
A typical model context protocol includes:
- Input Schema
  - Feature names, types, dimensions.
  - Example: {"user_id": int, "age": float, "clicked_items": list}
- Context
  - Additional info that influences predictions.
  - Example: time of day, device type, location.
- Model Metadata
  - Model version, training data info, assumptions.
  - Example: "version": "1.2.3", "trained_on": "MovieLens 1M"
- Output Schema
  - Structure of the prediction.
  - Example: {"recommended_item": str, "confidence": float}
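Such a contract can be enforced in code. A minimal sketch using Pydantic (the same validation library the FastAPI notes use later); the field names are hypothetical, borrowed from the recommendation example:

```python
from pydantic import BaseModel

# Hypothetical input contract: every caller must send exactly
# these fields with these types, or validation fails.
class RecommendationInput(BaseModel):
    user_id: int
    interaction_history: list[int]

# Hypothetical output contract: every consumer knows what comes back.
class RecommendationOutput(BaseModel):
    recommended_items: list[int]
    confidence_scores: list[float]
    model_version: str

inp = RecommendationInput(user_id=123, interaction_history=[45, 67, 89])
print(inp.user_id)
```

Sending a wrong type (say, a non-numeric user_id) raises a validation error instead of silently producing a bad prediction.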
3. Example (Recommendation System MCP)
Input Context Protocol:
{
"user_id": 123,
"session_features": {
"time_of_day": "evening",
"device": "mobile"
},
"interaction_history": [45, 67, 89] // item IDs
}
Model Output:
{
"recommended_items": [101, 202, 303],
"confidence_scores": [0.92, 0.85, 0.80],
"model_version": "v1.0.5"
}
This ensures every service consuming the model knows exactly what to send and expect back.
4. Protocols in the Real World
- TensorFlow Serving → uses gRPC/REST with JSON or Protobuf schemas.
- TorchServe → defines handler classes for input-output schemas.
- ONNX Runtime → standardized model format across frameworks.
- MLOps systems (Kubeflow, MLflow, Seldon) → rely heavily on context protocols for reproducibility.
5. Interview Quick Recap
- MCP = contract between model and environment.
- Defines input schema, context info, output schema.
- Needed for scaling, deploying, and debugging ML models.
- Real-world implementations → TensorFlow Serving, TorchServe, ONNX, MLflow.
Agentic AI – Beginner-Friendly Notes
1. What is Agentic AI?
- Agentic AI = AI systems that can act as "agents."
- Unlike traditional models (which just take input → give output), agentic AI:
  - Perceives the environment (via data, sensors, APIs).
  - Plans actions (chooses a strategy or sequence of steps).
  - Acts on the environment (via tools, APIs, physical systems).
  - Learns & adapts based on feedback.
- In short: Agentic AI doesn't just answer, it does things autonomously.
2. Why is it Important?
- Moves AI from passive assistants → active problem-solvers.
- Can execute multi-step workflows, not just answer single queries.
- Key for autonomous research, robotics, personalized assistants, and business automation.
3. Core Components of Agentic AI
- Perception
  - Collects information (from text, images, sensors, APIs).
- Memory
  - Short-term memory (conversation context).
  - Long-term memory (stored knowledge, databases).
- Planning & Reasoning
  - Breaks complex goals into smaller steps.
  - Uses chain-of-thought or planning algorithms.
- Tools & Actions
  - Can call APIs, run code, browse the web, query databases.
- Feedback & Learning
  - Evaluates actions, updates strategy.
4. Techniques Behind Agentic AI
- LLM + Tools (Tool Use)
  - The LLM calls external tools (calculator, search engine, database).
- Reasoning + Planning
  - Approaches like Tree of Thoughts and ReAct (Reason + Act).
- Multi-Agent Systems
  - Several AI agents collaborate (research agent, writing agent, coding agent).
- Reinforcement Learning (RL)
  - Agents learn optimal actions via trial & error.
- Memory Augmentation
  - Vector databases (Pinecone, FAISS) to recall past interactions.
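The tool-use pattern above can be sketched without any LLM at all: an agent loop that inspects a request, picks a registered tool, and executes it. Everything here (the tool names, the dispatch rule) is hypothetical and only illustrates the plan-act-report shape:

```python
# Minimal agent loop: a registry of tools and a trivial "planner"
# that routes a request to the right tool.

def calculator(expression: str) -> str:
    # Toy arithmetic tool. In a real agent, never eval untrusted input.
    return str(eval(expression, {"__builtins__": {}}, {}))

def lookup(term: str) -> str:
    # Stand-in for a search or database tool.
    kb = {"python": "A programming language."}
    return kb.get(term.lower(), "No entry found.")

TOOLS = {"calculator": calculator, "lookup": lookup}

def agent(request: str) -> str:
    # Toy planning step: pick a tool based on the request's content.
    tool = "calculator" if any(c.isdigit() for c in request) else "lookup"
    result = TOOLS[tool](request)   # act
    return f"[{tool}] {result}"     # report the observation

print(agent("2 + 3"))    # routed to the calculator
print(agent("python"))   # routed to the lookup tool
```

In a real agentic system the planner is an LLM choosing among tool descriptions, and the observation is fed back into the next reasoning step (the ReAct loop).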
5. Examples of Agentic AI
- AutoGPT: an LLM agent that autonomously executes tasks.
- LangChain Agents: orchestrate LLMs + tools.
- ChatGPT with browsing/code interpreter: uses external tools.
- Robotic agents: AI agents that control robots (self-driving cars, drones).
- Enterprise AI agents: automate workflows (customer service, report generation).
6. Comparison
| Type | Traditional AI | Agentic AI |
|---|---|---|
| Input/Output | Fixed Q → A | Dynamic, context-driven |
| Autonomy | No | Yes |
| Tool Usage | Limited | Uses APIs, tools |
| Planning | None | Multi-step reasoning |
| Adaptability | Low | High |
7. Challenges in Agentic AI
- Hallucination risk → wrong actions.
- Safety & alignment → ensure AI follows human values.
- Reliability → needs guardrails to avoid harmful actions.
- Scalability → costly if not optimized.
- Evaluation → harder to test compared to static models.
8. Applications
- Personal assistants (schedule meetings, send emails).
- Business automation (generate reports, analyze markets).
- Research (autonomous discovery, literature review).
- Healthcare (monitor patients, suggest treatments).
- Robotics (self-driving cars, drones, warehouse robots).
9. Future of Agentic AI
- More collaborative AI ecosystems (multi-agent teams).
- Safe & explainable reasoning mechanisms.
- Integration with IoT & robotics → fully autonomous systems.
- Potential to become co-workers, not just tools.
10. Interview Quick Recap
- Agentic AI = AI systems that perceive, plan, act, and learn.
- Core: Perception, Memory, Planning, Tools, Feedback.
- Techniques: ReAct, Tree of Thoughts, RL, multi-agent.
- Examples: AutoGPT, LangChain, ChatGPT (with tools).
- Challenges: hallucinations, safety, evaluation.
- Applications: assistants, automation, robotics, research.
Streamlit Complete Notes (Beginner-Friendly)
1. What is Streamlit?
- Streamlit is an open-source Python framework for building interactive web apps for data science, ML, and visualization.
- No need to know frontend (HTML/CSS/JS).
- Just write Python and deploy as a web app.
2. Installation
pip install streamlit
Check version:
streamlit --version
Run an app:
streamlit run app.py
3. Basic App
app.py
import streamlit as st
st.title("Hello Streamlit")
st.write("This is my first Streamlit app")
Run:
streamlit run app.py
Opens in browser at http://localhost:8501.
4. Streamlit Basics
Text and Titles
st.title("Title")
st.header("Header")
st.subheader("Subheader")
st.text("Simple text")
st.markdown("**Bold with Markdown**")
Data Display
import pandas as pd
df = pd.DataFrame({"Name": ["A", "B"], "Age": [23, 34]})
st.dataframe(df) # Interactive table
st.table(df) # Static table
Charts
st.line_chart(df["Age"])  # Simple line chart (numeric columns only)
st.bar_chart(df["Age"])
5. User Input Widgets
name = st.text_input("Enter your name:")
age = st.number_input("Enter age", min_value=0, max_value=100)
gender = st.radio("Select gender", ["Male", "Female"])
hobby = st.multiselect("Choose hobbies", ["Reading", "Gaming", "Sports"])
submit = st.button("Submit")
if submit:
    st.write(f"Hello {name}, Age: {age}, Gender: {gender}, Hobby: {hobby}")
6. File Upload
uploaded_file = st.file_uploader("Upload CSV", type="csv")
if uploaded_file:
    df = pd.read_csv(uploaded_file)
    st.dataframe(df.head())
7. Layouts
- Sidebar for navigation:
st.sidebar.title("Options")
choice = st.sidebar.radio("Menu", ["Home", "About"])
if choice == "Home":
    st.write("Welcome to Home")
else:
    st.write("About Page")
- Columns:
col1, col2 = st.columns(2)
col1.write("Left Side")
col2.write("Right Side")
- Tabs:
tab1, tab2 = st.tabs(["Data", "Charts"])
with tab1:
    st.write("Data Section")
with tab2:
    st.write("Charts Section")
8. Caching (for performance)
@st.cache_data
def load_data():
    df = pd.read_csv("big_data.csv")
    return df
9. Machine Learning Demo
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
iris = load_iris()
X, y = iris.data, iris.target
model = RandomForestClassifier()
model.fit(X, y)
st.title("Iris Flower Prediction")
sepal_length = st.slider("Sepal Length", 4.0, 8.0, 5.0)
sepal_width = st.slider("Sepal Width", 2.0, 4.5, 3.0)
petal_length = st.slider("Petal Length", 1.0, 7.0, 4.0)
petal_width = st.slider("Petal Width", 0.1, 2.5, 1.0)
prediction = model.predict([[sepal_length, sepal_width, petal_length, petal_width]])
st.write("Predicted Class:", iris.target_names[prediction][0])
10. Deploy Streamlit App
- Use Streamlit Community Cloud (free).
Steps:
- Push app code to GitHub.
- Go to https://streamlit.io/cloud.
- Connect the GitHub repo.
- Deploy.
Alternative: Deploy on Heroku, Render, AWS, or GCP.
11. Example Mini Project: Sales Dashboard
import pandas as pd
import streamlit as st
import plotly.express as px
st.title("Sales Dashboard")
uploaded_file = st.file_uploader("Upload Sales CSV", type="csv")
if uploaded_file:
    df = pd.read_csv(uploaded_file)
    st.write("Data Preview:", df.head())
    fig = px.line(df, x="Date", y="Sales", title="Sales Over Time")
    st.plotly_chart(fig)
    st.metric("Total Sales", df["Sales"].sum())
12. Key Takeaways
- Streamlit = Python → web app (no frontend needed).
- Supports charts, ML models, dashboards, file uploads.
- Very beginner-friendly and fast to prototype.
- Free hosting on Streamlit Cloud.
Flask Complete Notes (Beginner-Friendly)
1. What is Flask?
- Flask is a lightweight Python web framework used to build web applications and APIs.
- Known as a "micro-framework" because it doesn't come with built-in tools like a database ORM or authentication — you add only what you need.
- Great for beginners, prototyping, and even production apps.
2. Installing Flask
pip install flask
Check installation:
import flask
print(flask.__version__)
3. Basic Flask App
from flask import Flask
app = Flask(__name__)

@app.route("/")
def home():
    return "Hello, Flask!"

if __name__ == "__main__":
    app.run(debug=True)
- Flask(__name__) → creates an app object.
- @app.route("/") → defines the URL route.
- app.run(debug=True) → starts the server with auto-reload and error tracking.
Run app:
python app.py
Visit: http://127.0.0.1:5000
4. Routing
-
Add multiple pages by defining routes:
@app.route("/about")
def about():
return "This is the About Page"
-
Dynamic routes:
@app.route("/user/<name>")
def user(name):
return f"Hello, {name}!"
5. Templates (HTML Integration)
Flask uses Jinja2 templates to render HTML.
Folder structure:
project/
  app.py
  templates/
    index.html
app.py:
from flask import render_template

@app.route("/")
def home():
    return render_template("index.html", name="Sanjay")
templates/index.html:
<!DOCTYPE html>
<html>
<head><title>Flask App</title></head>
<body>
<h1>Hello, {{ name }}</h1>
</body>
</html>
6. Static Files (CSS, JS, Images)
Folder structure:
project/
  static/
    style.css
HTML usage:
<link rel="stylesheet" href="{{ url_for('static', filename='style.css') }}">
7. Forms and User Input
from flask import request

@app.route("/login", methods=["GET", "POST"])
def login():
    if request.method == "POST":
        username = request.form["username"]
        return f"Welcome, {username}"
    return '''
    <form method="post">
        <input name="username">
        <input type="submit">
    </form>
    '''
8. REST API with Flask
from flask import jsonify

@app.route("/api/data")
def get_data():
    return jsonify({"name": "Sanjay", "role": "Data Scientist"})
9. Flask with Database (SQLite Example)
import sqlite3
from flask import g

DATABASE = "test.db"

def get_db():
    db = getattr(g, "_database", None)
    if db is None:
        db = g._database = sqlite3.connect(DATABASE)
    return db

@app.route("/add")
def add_data():
    db = get_db()
    db.execute("INSERT INTO users (name) VALUES (?)", ("Sanjay",))
    db.commit()
    return "User added!"
10. Flask Extensions (Popular)
-
Flask-SQLAlchemy → Database ORM
-
Flask-Login → Authentication
-
Flask-RESTful → Build APIs easily
-
Flask-WTF → Form handling
-
Flask-Mail → Send emails
11. Deploying Flask
- Local run:
python app.py
- Production (Gunicorn):
pip install gunicorn
gunicorn -w 4 app:app
- Can deploy on Heroku, Render, AWS, GCP, or Railway.
12. Mini Project Example (Hello API + Webpage)
app.py:
from flask import Flask, render_template, jsonify
app = Flask(__name__)

@app.route("/")
def home():
    return render_template("index.html", name="Flask Learner")

@app.route("/api")
def api():
    return jsonify({"message": "This is Flask API!"})

if __name__ == "__main__":
    app.run(debug=True)
index.html:
<h1>Welcome {{ name }}</h1>
<p>Check API at <a href="/api">/api</a></p>
13. Key Points Summary
- Flask = lightweight, flexible Python web framework.
- Uses routes for pages and APIs.
- Templates + static files = frontend support.
- Extensions add extra power (DB, login, forms).
- Easy to deploy anywhere.
FastAPI Complete Notes (Beginner Friendly)
1. What is FastAPI?
FastAPI is a modern, fast (high-performance) Python framework used for building APIs and backend services.
✅ Built on Starlette (for web) and Pydantic (for data validation)
✅ Designed for speed and type safety
✅ Ideal for Machine Learning APIs, microservices, and real-time data systems
2. Why Use FastAPI?
| Feature | Description |
|---|---|
| Fast | Built on ASGI → handles requests asynchronously |
| Data validation | Uses Pydantic models for strict schema validation |
| Automatic docs | Swagger UI and ReDoc auto-generated |
| Easy integration | Works well with SQL, NoSQL, ML, and OAuth2 |
| Modern syntax | Type hints and async/await supported |
| Great for ML | Well suited to deploying ML models as REST APIs |
3. Installation
pip install fastapi uvicorn
✅ fastapi → API framework
✅ uvicorn → ASGI server to run your app
4. Create Your First FastAPI App
main.py
from fastapi import FastAPI
app = FastAPI()
@app.get("/")
def home():
return {"message": "Hello, FastAPI!"}
Run the app:
uvicorn main:app --reload
Now visit:
- http://127.0.0.1:8000 → API output
- http://127.0.0.1:8000/docs → Swagger UI
- http://127.0.0.1:8000/redoc → ReDoc UI
5. HTTP Methods
| Method | Usage | Example |
|---|---|---|
| GET | Read data | /users |
| POST | Create new data | /users |
| PUT | Update entire record | /users/{id} |
| PATCH | Update part of record | /users/{id} |
| DELETE | Delete record | /users/{id} |
Example:
@app.get("/items/{item_id}")
def get_item(item_id: int):
return {"item_id": item_id}
6. Query Parameters
@app.get("/search/")
def search_items(q: str, limit: int = 5):
return {"query": q, "limit": limit}
➡️ Access like: http://127.0.0.1:8000/search?q=apple&limit=10
7. Request Body with Pydantic
Used for validating and structuring input JSON.
from pydantic import BaseModel

class Item(BaseModel):
    name: str
    price: float
    in_stock: bool = True

@app.post("/items/")
def create_item(item: Item):
    return {"item_name": item.name, "price": item.price}
Input JSON example:
{
"name": "Laptop",
"price": 80000
}
8. Path Parameters + Validation
from fastapi import Path

@app.get("/users/{user_id}")
def read_user(user_id: int = Path(..., gt=0, description="User ID must be > 0")):
    return {"user_id": user_id}
9. Handling Query + Path Together
@app.get("/products/{product_id}")
def read_product(product_id: int, q: str | None = None):
if q:
return {"product_id": product_id, "query": q}
return {"product_id": product_id}
10. Response Models (Structured Output)
class User(BaseModel):
id: int
name: str
email: str
@app.get("/user/{id}", response_model=User)
def get_user(id: int):
return {"id": id, "name": "Sanjay", "email": "sanjay@example.com"}
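Besides documenting the output schema, `response_model` filters the response: keys not declared on the model are dropped before the JSON is sent. A plain-Python sketch of that effect (the `filter_response` helper is hypothetical, not FastAPI internals):

```python
def filter_response(data: dict, model_fields=("id", "name", "email")) -> dict:
    # Keep only the fields declared on the response model
    return {k: v for k, v in data.items() if k in model_fields}

raw = {"id": 1, "name": "Sanjay", "email": "sanjay@example.com", "password": "secret"}
print(filter_response(raw))  # 'password' never reaches the client
```

This is why response models are a common way to avoid accidentally leaking internal fields such as password hashes.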
11. Handling Errors
from fastapi import HTTPException
@app.get("/divide")
def divide(a: float, b: float):
if b == 0:
raise HTTPException(status_code=400, detail="Division by zero not allowed")
return {"result": a / b}
12. Dependency Injection
Used to manage reusable logic like authentication, DB connections, etc.
from fastapi import Depends
def get_token_header(token: str):
if token != "abc123":
raise HTTPException(status_code=403, detail="Invalid token")
return token
@app.get("/secure-data/")
def read_secure_data(token: str = Depends(get_token_header)):
return {"data": "Secure content!"}
13. Middleware
Middleware intercepts requests/responses.
from fastapi.middleware.cors import CORSMiddleware
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_methods=["*"],
allow_headers=["*"],
)
14. Connect FastAPI with MongoDB
from fastapi import FastAPI
from pymongo import MongoClient
from pydantic import BaseModel
app = FastAPI()
client = MongoClient("mongodb://localhost:27017/")
db = client["fastapi_db"]
collection = db["users"]
class User(BaseModel):
name: str
age: int
@app.post("/add_user")
def add_user(user: User):
collection.insert_one(user.model_dump())  # use user.dict() on Pydantic v1
return {"message": "User added"}
@app.get("/users")
def get_users():
users = list(collection.find({}, {"_id": 0}))
return {"users": users}
15. FastAPI + Machine Learning Example
from fastapi import FastAPI
from pydantic import BaseModel
import pickle
import numpy as np
app = FastAPI()
# Load model
model = pickle.load(open("model.pkl", "rb"))
class InputData(BaseModel):
feature1: float
feature2: float
feature3: float
@app.post("/predict")
def predict(data: InputData):
features = np.array([[data.feature1, data.feature2, data.feature3]])
prediction = model.predict(features)
return {"prediction": float(prediction[0])}
Run with:
uvicorn main:app --reload
➡️ Try it at http://127.0.0.1:8000/docs
16. Include Routers (for large projects)
from fastapi import APIRouter
router = APIRouter()
@router.get("/info")
def info():
return {"info": "Sub-router working!"}
app.include_router(router, prefix="/api")
17. Authentication (Basic Example)
from fastapi.security import OAuth2PasswordBearer
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")
@app.get("/secure/")
def secure_data(token: str = Depends(oauth2_scheme)):
return {"token": token}
18. Static Files & Templates
from fastapi.staticfiles import StaticFiles
from fastapi.templating import Jinja2Templates
from fastapi import Request
app.mount("/static", StaticFiles(directory="static"), name="static")
templates = Jinja2Templates(directory="templates")
@app.get("/home")
def home(request: Request):
return templates.TemplateResponse("index.html", {"request": request})
19. Common FastAPI Commands
| Command | Description |
|---|---|
| uvicorn main:app --reload | Run development server |
| /docs | Swagger documentation |
| /redoc | Alternative documentation |
| Ctrl+C | Stop server |
| pip install "uvicorn[standard]" | Install complete server dependencies |
20. Best Practices
✅ Use Pydantic models for input/output
✅ Keep routers in separate files for modular code
✅ Add CORS middleware for frontend integration
✅ Implement logging & error handling
✅ Use async functions for I/O-heavy operations
✅ Deploy using Docker / Gunicorn / Uvicorn
21. Deployment (Production)
Option 1: Using Uvicorn + Gunicorn
pip install "uvicorn[standard]" gunicorn
gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker
Option 2: Docker
FROM python:3.10
WORKDIR /app
COPY . .
RUN pip install fastapi uvicorn
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
22. Integration Possibilities
| Tool | Integration Use |
|---|---|
| MongoDB / SQLAlchemy | Database |
| Pandas / Numpy | Data analysis |
| Scikit-learn / XGBoost | ML model prediction APIs |
| Streamlit / React | Frontend UI |
| Docker / K8s | Deployment |
| Prometheus | API performance monitoring |
PySpark Complete Notes (Beginner Friendly)
1. What is PySpark?
PySpark is the Python API for Apache Spark, a powerful open-source framework used for big data processing, analysis, and machine learning across distributed clusters.
✅ Built on Apache Spark
✅ Handles large-scale data (GBs → TBs) efficiently
✅ Works on clusters (parallel computation)
✅ Supports DataFrames, SQL, MLlib, Streaming
2. Why Use PySpark?
| Feature | Description |
|---|---|
| Speed | Up to 100x faster than Hadoop MapReduce for in-memory workloads |
| Scalable | Handles terabytes to petabytes of data |
| Easy API | Pandas-like DataFrame operations |
| Multiple data sources | CSV, JSON, Parquet, HDFS, S3 |
| Machine learning support | Spark MLlib |
| Integrations | AWS EMR, Databricks, Hadoop, Google Dataproc |
3. PySpark Architecture
+----------------------------------------------------------+
| PySpark |
|----------------------------------------------------------|
| Driver Program (main code) |
| ↓ |
| SparkContext → Cluster Manager → Executors (workers) |
| Each executor runs tasks on partitions of data |
+----------------------------------------------------------+
- Driver: Your Python script that controls the job.
- Executor: Worker nodes that perform computations.
- Cluster Manager: Allocates resources (e.g., YARN, Mesos, Kubernetes).
4. Installation
Local Installation:
pip install pyspark
Check version:
import pyspark
print(pyspark.__version__)
5. Starting PySpark Session
from pyspark.sql import SparkSession
spark = SparkSession.builder \
.appName("MyFirstSparkApp") \
.getOrCreate()
print(spark)
To stop session:
spark.stop()
6. Create a DataFrame
From Python data:
data = [("Alice", 25), ("Bob", 30), ("Cathy", 27)]
columns = ["Name", "Age"]
df = spark.createDataFrame(data, columns)
df.show()
Output:
+-----+---+
| Name|Age|
+-----+---+
|Alice| 25|
| Bob| 30|
|Cathy| 27|
+-----+---+
7. Read / Write Data
| Operation | Example |
|---|---|
| Read CSV | df = spark.read.csv("data.csv", header=True, inferSchema=True) |
| Read JSON | df = spark.read.json("data.json") |
| Read Parquet | df = spark.read.parquet("data.parquet") |
| Write CSV | df.write.csv("output/", header=True) |
8. Basic DataFrame Operations
df.printSchema() # View schema
df.columns # Get column names
df.describe().show() # Summary stats
df.select("Name").show()
df.filter(df.Age > 25).show()
df.groupBy("Age").count().show()
df.orderBy("Age", ascending=False).show()
9. Add / Rename / Drop Columns
from pyspark.sql.functions import col, lit
df = df.withColumn("Country", lit("India")) # Add column
df = df.withColumnRenamed("Age", "Years") # Rename column
df = df.drop("Country") # Drop column
10. Handling Missing Data
df.na.drop().show() # Drop null rows
df.na.fill({"Age": 0}).show() # Fill nulls
df.na.replace("Unknown", "N/A").show() # Replace values
11. SQL with PySpark
Register DataFrame as temporary SQL table:
df.createOrReplaceTempView("people")
result = spark.sql("SELECT Name, Age FROM people WHERE Age > 25")
result.show()
12. PySpark Functions
Import frequently used functions:
from pyspark.sql.functions import *
df.select(upper(col("Name")), col("Age") + 5).show()
df.withColumn("AgeGroup", when(col("Age") > 25, "Adult").otherwise("Young")).show()
13. Joins in PySpark
df1.join(df2, on="id", how="inner")
df1.join(df2, on="id", how="left")
df1.join(df2, on="id", how="right")
df1.join(df2, on="id", how="outer")
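The difference between the join types can be illustrated with plain Python dicts (an illustrative sketch only — Spark performs the same matching, but distributed across partitions):

```python
left = {1: "Alice", 2: "Bob"}    # like df1: id -> name
right = {2: "HR", 3: "Finance"}  # like df2: id -> dept

# inner: only ids present on both sides
inner = {k: (left[k], right[k]) for k in left.keys() & right.keys()}

# left: every left id, unmatched right side becomes None (null in Spark)
left_join = {k: (left[k], right.get(k)) for k in left}

# outer: every id from either side, missing sides become None
outer = {k: (left.get(k), right.get(k)) for k in left.keys() | right.keys()}

print(inner)      # {2: ('Bob', 'HR')}
print(left_join)  # {1: ('Alice', None), 2: ('Bob', 'HR')}
```

A right join is simply the mirror image of the left join.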
14. Aggregations
df.groupBy("Country").agg(
count("*").alias("Count"),
avg("Age").alias("AvgAge")
).show()
15. Working with Dates
from pyspark.sql.functions import current_date, year, month, dayofmonth
df = df.withColumn("today", current_date())
df = df.withColumn("year", year(col("today")))
df.show()
16. User Defined Functions (UDFs)
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType
def greeting(name):
return "Hello " + name
greet_udf = udf(greeting, StringType())
df = df.withColumn("Greet", greet_udf(col("Name")))
df.show()
17. Machine Learning with PySpark (MLlib)
Example: Linear Regression
from pyspark.ml.regression import LinearRegression
from pyspark.ml.feature import VectorAssembler
data = [(1, 2.0, 3.0), (2, 3.0, 5.0), (3, 4.0, 7.0)]
columns = ["id", "feature", "label"]
df = spark.createDataFrame(data, columns)
assembler = VectorAssembler(inputCols=["feature"], outputCol="features")
train_data = assembler.transform(df)
lr = LinearRegression(featuresCol="features", labelCol="label")
model = lr.fit(train_data)
print(model.coefficients, model.intercept)
18. PySpark with Pandas
Convert Spark DataFrame to Pandas:
pandas_df = df.toPandas()
Convert Pandas DataFrame to Spark:
spark_df = spark.createDataFrame(pandas_df)
19. Partitioning & Parallelism
- Spark divides data into partitions to process in parallel.
- Check partitions:
df.rdd.getNumPartitions()
- Repartition:
df = df.repartition(4)
20. Saving & Loading Models
model.save("lr_model")
from pyspark.ml.regression import LinearRegressionModel
loaded_model = LinearRegressionModel.load("lr_model")
21. Integration with AWS / GCP
| Platform | Method |
|---|---|
| AWS S3 | spark.read.csv("s3a://bucket/file.csv") |
| Google Cloud Storage | spark.read.csv("gs://bucket/file.csv") |
| Hadoop HDFS | spark.read.csv("hdfs://path/file.csv") |
22. Performance Tips
✅ Use Parquet instead of CSV (columnar & compressed)
✅ Use filter() early (predicate pushdown)
✅ Cache DataFrames with .cache() for reuse
✅ Avoid too many small files
✅ Use broadcast joins for small lookup tables
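The broadcast-join tip works because a small lookup table can be copied to every executor and joined via a hash lookup, so the large table never has to be shuffled across the network. The idea in plain Python (illustrative sketch, not Spark code):

```python
# Small lookup table, "broadcast" to every worker as an ordinary in-memory dict
region_names = {1: "North", 2: "South"}

# Large table: each row joins via an O(1) hash lookup, no shuffle needed
sales = [(1, 100), (2, 250), (1, 75)]
joined = [(region_names[region_id], amount) for region_id, amount in sales]
print(joined)  # [('North', 100), ('South', 250), ('North', 75)]
```

In PySpark the equivalent hint is `large_df.join(broadcast(small_df), "id")`, with `broadcast` imported from `pyspark.sql.functions`.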
23. PySpark Data Types
| PySpark Type | Equivalent Python Type |
|---|---|
| StringType() | str |
| IntegerType() | int |
| DoubleType() | float |
| BooleanType() | bool |
| TimestampType() | datetime |
Example:
from pyspark.sql.types import StructType, StructField, StringType, IntegerType
schema = StructType([
StructField("Name", StringType(), True),
StructField("Age", IntegerType(), True)
])
df = spark.createDataFrame(data, schema)
24. Common PySpark Functions
| Function | Purpose |
|---|---|
| col() | Access a column |
| lit() | Add a constant value |
| when() | Conditional column |
| count(), sum(), avg() | Aggregations |
| regexp_extract() | Regex matching |
| concat_ws() | String concatenation |
| explode() | Flatten an array column |
25. Example: End-to-End ETL Pipeline
from pyspark.sql import SparkSession
from pyspark.sql.functions import *
spark = SparkSession.builder.appName("ETL Example").getOrCreate()
# Read
df = spark.read.csv("sales.csv", header=True, inferSchema=True)
# Transform
df_clean = df.filter(col("Amount").isNotNull())
df_final = df_clean.groupBy("Region").agg(sum("Amount").alias("TotalSales"))
# Load
df_final.write.csv("output/sales_summary", header=True)
26. Spark MLlib Use Cases
- Regression: Linear Regression
- Classification: Logistic Regression, Decision Trees, Random Forests
- Clustering: K-Means
- Feature Engineering: VectorAssembler, StandardScaler
- Pipelines: Combine multiple transformations
27. PySpark vs Pandas
| Feature | Pandas | PySpark |
|---|---|---|
| Scale | Small data (in-memory) | Big data (distributed) |
| Speed | Single machine | Multi-node cluster |
| API | Easy & rich | Similar syntax |
| Use Case | EDA | ETL, ML on big data |
28. Common Use Cases
✅ ETL on large datasets
✅ Feature engineering for ML
✅ Log analysis
✅ Data cleaning at scale
✅ Joining datasets across clusters
Kubernetes Complete Notes (Beginner-Friendly)
1. What is Kubernetes?
- Kubernetes (K8s) is an open-source platform for automating deployment, scaling, and management of containerized applications (like Docker containers).
- It helps you run applications reliably across clusters of machines (physical or virtual).
- Originally developed by Google, now maintained by the Cloud Native Computing Foundation (CNCF).
2. Why Use Kubernetes?
✅ Automatic scaling of apps
✅ Self-healing — restarts crashed containers
✅ Load balancing between containers
✅ Rolling updates for zero downtime
✅ Portability — works on any cloud or on-prem
3. Basic Terminology
| Concept | Description |
|---|---|
| Cluster | Set of nodes (machines) managed by Kubernetes |
| Node | A worker machine (physical or VM) that runs pods |
| Pod | The smallest deployable unit — one or more containers |
| Container | Application running inside Docker (or similar runtime) |
| Service | Exposes pods to the network (for communication) |
| Deployment | Manages replicas of pods and ensures desired state |
| Namespace | Logical grouping of resources (like folders) |
| Ingress | Manages external access (HTTP/HTTPS) to services |
| ConfigMap / Secret | Store configuration or sensitive data separately |
4. Architecture Overview
+-----------------------------------------------------------+
| Kubernetes Cluster |
|-----------------------------------------------------------|
| Control Plane (Master Node) |
| • kube-apiserver → Handles all requests (API) |
| • etcd → Key-value store for cluster data |
| • scheduler → Assigns pods to worker nodes |
| • controller-mgr → Monitors cluster state |
|-----------------------------------------------------------|
| Worker Nodes |
| • kubelet → Communicates with control plane |
| • kube-proxy → Networking for pods |
| • container runtime (Docker/Containerd) |
+-----------------------------------------------------------+
5. Installation (Local Setup)
Option 1: Minikube (for local testing)
# Install Minikube
brew install minikube # macOS
choco install minikube # Windows
# Start cluster
minikube start
# Verify setup
kubectl get nodes
Option 2: Cloud Providers
- Google Kubernetes Engine (GKE)
- AWS Elastic Kubernetes Service (EKS)
- Azure Kubernetes Service (AKS)
6. Core Kubernetes Components
Pods
apiVersion: v1
kind: Pod
metadata:
name: myapp-pod
spec:
containers:
- name: myapp
image: nginx
ports:
- containerPort: 80
Deploy:
kubectl apply -f pod.yaml
kubectl get pods
kubectl describe pod myapp-pod
Deployment
Used to manage and scale pods automatically.
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp-deployment
spec:
replicas: 3
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
spec:
containers:
- name: myapp
image: nginx
ports:
- containerPort: 80
Deploy and check:
kubectl apply -f deployment.yaml
kubectl get deployments
kubectl get pods
Service
Exposes deployment to network (internal or external).
apiVersion: v1
kind: Service
metadata:
name: myapp-service
spec:
type: NodePort
selector:
app: myapp
ports:
- port: 80
targetPort: 80
nodePort: 30001
Check service:
kubectl get svc
minikube service myapp-service
7. Scaling
kubectl scale deployment myapp-deployment --replicas=5
kubectl get pods
8. Rolling Updates
kubectl set image deployment/myapp-deployment myapp=nginx:latest
kubectl rollout status deployment/myapp-deployment
Rollback:
kubectl rollout undo deployment/myapp-deployment
9. ConfigMaps & Secrets
ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
name: app-config
data:
APP_MODE: "production"
Secret:
apiVersion: v1
kind: Secret
metadata:
name: db-secret
type: Opaque
data:
DB_PASSWORD: cGFzc3dvcmQ= # base64 encoded
Mount in pod:
envFrom:
- configMapRef:
name: app-config
- secretRef:
name: db-secret
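Secret values must be base64-encoded, not encrypted — the `cGFzc3dvcmQ=` in the Secret above is just the string `password` encoded. You can produce and verify such values with Python's standard library:

```python
import base64

# Encode a secret value for use in a Kubernetes Secret manifest
encoded = base64.b64encode(b"password").decode()
print(encoded)                    # cGFzc3dvcmQ=  (matches DB_PASSWORD above)

# Decode to confirm what a Secret actually contains
print(base64.b64decode(encoded))  # b'password'
```

Because base64 is trivially reversible, treat Secrets as access-controlled configuration, not encryption.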
10. Namespaces
kubectl create namespace dev
kubectl get namespaces
kubectl apply -f app.yaml -n dev
11. Logs & Monitoring
kubectl logs pod_name
kubectl describe pod pod_name
kubectl top pods # if metrics-server installed
Popular tools:
- Prometheus + Grafana (metrics)
- ELK Stack (logs)
- Lens (GUI dashboard)
12. Real-World Example Flow
1. Dockerize your ML/web app → write a Dockerfile
2. Push the image to Docker Hub or a private registry
3. Create deployment.yaml and service.yaml
4. Apply the configs:
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
5. Scale if needed:
kubectl scale deployment myapp --replicas=4
6. Access the app:
minikube service myapp-service
13. Useful Commands
| Command | Description |
|---|---|
| kubectl get pods | List all pods |
| kubectl get svc | List all services |
| kubectl describe pod <name> | Pod details |
| kubectl delete pod <name> | Delete pod |
| kubectl logs <pod> | Show logs |
| kubectl apply -f file.yaml | Apply configuration |
| kubectl exec -it <pod> -- bash | Access container shell |
14. Kubernetes vs Docker
| Feature | Docker | Kubernetes |
|---|---|---|
| Scope | Containerization | Orchestration |
| Scale | Single host | Multi-host clusters |
| Self-healing | No | Yes |
| Load Balancing | Manual | Automatic |
| Configuration | Docker CLI | YAML manifests |
15. Key Takeaways
- Kubernetes = container orchestrator for scaling & managing apps.
- Works hand-in-hand with Docker.
- Core concepts: Pod, Deployment, Service, ConfigMap, Namespace.
- Ideal for ML model serving, microservices, and production apps.
- Learn kubectl commands + YAML basics to get started fast.
Prometheus Complete Notes (Beginner-Friendly)
1. What is Prometheus?
- Prometheus is an open-source monitoring and alerting toolkit designed for time-series data (metrics that change over time).
- Commonly used for:
  - Monitoring applications, infrastructure, and Kubernetes clusters.
  - Setting alerts when performance issues or failures occur.
  - Visualizing metrics in Grafana dashboards.
2. Key Features
✅ Time-series database
✅ Pull-based metrics collection (scrapes from targets)
✅ Multi-dimensional data model
✅ Powerful query language — PromQL
✅ Integrates easily with Grafana
✅ Lightweight and easy to set up
3. How Prometheus Works (Architecture)
+------------------------------------------------------+
| Prometheus |
|------------------------------------------------------|
| 1. Targets (exporters, apps, K8s nodes) |
| 2. Scrapes metrics via HTTP endpoints (/metrics) |
| 3. Stores data as time-series in local DB |
| 4. Query metrics via PromQL |
| 5. Triggers alerts (Alertmanager) |
| 6. Visualize with Grafana |
+------------------------------------------------------+
4. Key Components
| Component | Description |
|---|---|
| Prometheus Server | Collects and stores metrics data |
| Exporters | Expose metrics in Prometheus format |
| Alertmanager | Sends notifications (Email, Slack, etc.) |
| PromQL | Query language for analyzing metrics |
| Pushgateway | For short-lived jobs that push metrics |
| Grafana | For dashboard visualization |
5. Installation (Local Setup)
Step 1: Download Prometheus from https://prometheus.io/download/
Step 2: Extract and run:
./prometheus --config.file=prometheus.yml
Visit: http://localhost:9090
6. Configuration File (prometheus.yml)
This file defines what to monitor (targets) and how often to scrape.
Example:
global:
scrape_interval: 15s # How often to collect metrics
scrape_configs:
- job_name: "prometheus"
static_configs:
- targets: ["localhost:9090"]
- job_name: "myapp"
static_configs:
- targets: ["localhost:8000"]
Here, Prometheus scrapes metrics from itself (port 9090) and from your app (port 8000).
7. Exporters (for Different Systems)
Exporters are lightweight programs that expose metrics.
| Exporter | Purpose |
|---|---|
| Node Exporter | OS-level metrics (CPU, RAM, Disk) |
| cAdvisor | Container metrics |
| Kube State Metrics | Kubernetes cluster info |
| Blackbox Exporter | Endpoint uptime check |
| MySQL/Postgres Exporter | Database metrics |
| JMX Exporter | Java apps (JVM metrics) |
Run Node Exporter:
./node_exporter
Access metrics: http://localhost:9100/metrics
8. Integrating Prometheus with Python / Flask App
Step 1: Install client library
pip install prometheus-client
Step 2: Add metrics endpoint in your app
from flask import Flask, Response
from prometheus_client import Counter, generate_latest
app = Flask(__name__)
REQUEST_COUNT = Counter('request_count', 'Total web requests')
@app.route("/")
def home():
REQUEST_COUNT.inc()
return "Hello, Prometheus!"
@app.route("/metrics")
def metrics():
return Response(generate_latest(), mimetype="text/plain")
if __name__ == "__main__":
app.run(port=8000)
Now Prometheus can scrape metrics from /metrics endpoint every few seconds.
9. Querying Metrics with PromQL
PromQL = Prometheus Query Language
Common examples:
| Query | Meaning |
|---|---|
| up | Target status (1 = up, 0 = down) |
| node_cpu_seconds_total | Total CPU time |
| rate(http_requests_total[5m]) | Requests per second (last 5 minutes) |
| sum(rate(http_requests_total[1m])) by (instance) | Requests per instance |
| avg_over_time(cpu_usage[10m]) | Average over last 10 minutes |
You can run these queries at http://localhost:9090/graph.
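`rate()` computes the per-second increase of a counter over the chosen window. A simplified sketch of the idea (ignoring Prometheus's handling of counter resets and extrapolation at window edges):

```python
# (timestamp_seconds, counter_value) samples across a 5-minute window
samples = [(0, 100), (60, 160), (120, 220), (180, 280), (240, 340), (300, 400)]

def simple_rate(samples):
    """Per-second increase between the first and last sample in the window."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)

print(simple_rate(samples))  # 1.0 (one request per second)
```

This is why `rate()` is applied to ever-increasing counters like `http_requests_total`: the raw counter only goes up, but its rate tells you the current traffic level.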
10. Setting Alerts
Add alerting rules:
groups:
- name: alert.rules
rules:
- alert: HighCPUUsage
expr: rate(node_cpu_seconds_total[1m]) > 0.8
for: 1m
labels:
severity: critical
annotations:
summary: "High CPU usage detected"
Start Prometheus with the rules file referenced in prometheus.yml (Alertmanager runs as a separate process):
./prometheus --config.file=prometheus.yml
Alertmanager can notify:
- Email
- Slack
- PagerDuty
- Telegram
11. Visualization with Grafana
1. Install Grafana → https://grafana.com/grafana/download
2. Open Grafana at http://localhost:3000
3. Add Prometheus as a data source:
  - URL: http://localhost:9090
4. Import dashboards (Node, Kubernetes, App metrics)
Now you can see live charts, e.g. CPU, memory, app response time.
12. Prometheus with Kubernetes
In Kubernetes, Prometheus monitors pods, nodes, and services.
You can deploy it easily using:
kubectl create namespace monitoring
kubectl apply -f https://github.com/prometheus-operator/kube-prometheus/releases/latest/download/manifests/setup
Or use Helm:
helm install prometheus prometheus-community/prometheus
This installs:
- Prometheus server
- Alertmanager
- Node exporter
- kube-state-metrics
Then access:
kubectl port-forward svc/prometheus-server 9090:80 -n monitoring
13. Common Use Cases
- Monitor Docker / Kubernetes clusters
- Track ML model latency & prediction counts
- Set alerts for high memory usage or downtime
- Integrate with Grafana dashboards for live monitoring
- Observe system health trends over time
14. Key Takeaways
| Concept | Description |
|---|---|
| Prometheus | Monitoring + alerting system |
| Metrics endpoint | /metrics — exposes time-series data |
| PromQL | Query and analyze data |
| Exporters | Provide metrics for different systems |
| Alertmanager | Triggers alerts on defined conditions |
| Grafana | Visualization tool for Prometheus data |
15. Mini Example: Full Flow Recap
1. Run a Python/Flask app with a /metrics endpoint
2. Install Prometheus and configure:
- job_name: 'flask'
  static_configs:
  - targets: ['localhost:8000']
3. Start Prometheus:
./prometheus --config.file=prometheus.yml
4. Open http://localhost:9090
5. Query: request_count_total
6. Add a Grafana dashboard → visualize in charts.
16. Tools That Work With Prometheus
- Grafana → dashboards
- Alertmanager → notifications
- Thanos → long-term storage
- VictoriaMetrics → scalable alternative
- Prometheus Operator → easy setup in Kubernetes
17. Quick Commands
| Command | Description |
|---|---|
| ./prometheus --config.file=prometheus.yml | Start Prometheus |
| kubectl port-forward svc/prometheus-server 9090:80 | Access in K8s |
| curl localhost:9090/metrics | Check metrics endpoint |
| systemctl status prometheus | Check service status (Linux) |
Grafana – Short Notes
1. What is Grafana?
Grafana is an open-source visualization and monitoring tool used to analyze metrics, logs, and traces from various data sources.
It helps you create interactive dashboards to monitor system performance, infrastructure, and applications.
2. Key Features
| Feature | Description |
|---|---|
| Dashboards | Visualize time-series data in real time |
| Plugins | Extend functionality (data sources, panels, apps) |
| Alerts | Set thresholds and receive alerts via email, Slack, etc. |
| Data Sources | Connect to Prometheus, InfluxDB, Elasticsearch, Loki, MySQL, etc. |
| User Management | Role-based access control |
| Templating | Dynamic, parameterized dashboards |
3. Grafana Architecture
+-------------------+
| Web UI (Dashboards) |
+----------+--------+
|
+----------v--------+
| Backend Server |
| (API + Logic) |
+----------+--------+
|
+----------v--------+
| Data Sources |
| (Prometheus, DBs) |
+-------------------+
- Frontend (UI): Displays dashboards
- Backend: Handles authentication, alerts, queries
- Data Sources: Provide time-series or metric data
4. Common Data Sources
- Prometheus – metrics monitoring
- Loki – log aggregation
- InfluxDB – time-series data
- Elasticsearch – search + analytics
- MySQL / PostgreSQL – SQL databases
- Cloud sources – AWS CloudWatch, Azure Monitor, GCP Stackdriver
5. Installing Grafana (Quick Setup)
On Ubuntu / Debian:
sudo apt-get install -y apt-transport-https
sudo apt-get install -y software-properties-common wget
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
sudo add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
sudo apt-get update
sudo apt-get install grafana
sudo systemctl start grafana-server
sudo systemctl enable grafana-server
Access at: http://localhost:3000
Default credentials:
user: admin / password: admin
6. Creating Dashboards
1. Log in → click "+" → Dashboard → Add new panel
2. Choose a data source (e.g., Prometheus)
3. Write a query (e.g., up, cpu_usage_total)
4. Choose a visualization type (Graph, Gauge, Table, etc.)
5. Save the dashboard
7. Alerts & Notifications
- Add an alert on a panel → set a condition (e.g., CPU > 80%)
- Configure a notification channel (Slack, Email, PagerDuty)
- Alert rules can be viewed & managed centrally
8. Panels & Visualizations
| Type | Use |
|---|---|
| Time Series | Continuous data (CPU, memory) |
| Gauge | Current metric value |
| Bar Gauge | Compare multiple values |
| Table | Tabular data |
| Stat | Single numeric indicator |
| Heatmap | Distribution visualization |
9. Variables (Templating)
- Create dynamic dashboards with dropdowns.
Example:
$server → all available server names
$metric → all available metrics
Used in a query as:
avg(cpu_usage{instance="$server"})
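Grafana substitutes the selected dropdown value into the query before sending it to the data source. The substitution itself is plain string templating, as this Python sketch illustrates (the instance name `web-01` is a made-up example):

```python
from string import Template

# A Grafana-style query with a $server template variable
query = Template('avg(cpu_usage{instance="$server"})')

# When the user picks "web-01" in the dropdown, Grafana fills it in
print(query.substitute(server="web-01"))  # avg(cpu_usage{instance="web-01"})
```

One dashboard can thus serve any number of servers: the panel query stays fixed while the variable changes.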
10. Grafana + Prometheus Workflow
1. Prometheus collects metrics from servers/applications
2. Grafana connects to Prometheus as a data source
3. Dashboards visualize the time-series metrics
4. Alerts notify when thresholds are crossed
11. Authentication & Roles
- Admin – full control
- Editor – can modify dashboards
- Viewer – read-only access
Supports LDAP, OAuth, Google, Azure AD, and GitHub authentication.
12. Cloud & Enterprise Versions
| Type | Description |
|---|---|
| Grafana OSS | Free open-source |
| Grafana Cloud | Hosted SaaS version |
| Grafana Enterprise | Adds support, SSO, auditing |
13. Integration Examples
- Prometheus + Grafana → system metrics
- Loki + Grafana → centralized log dashboards
- Tempo + Grafana → distributed tracing
- MySQL + Grafana → business analytics
14. Common Use Cases
✅ Infrastructure & server monitoring
✅ Application performance tracking
✅ Business KPIs visualization
✅ Log + Metric correlation (via Loki)
✅ Cloud resource monitoring
15. Grafana Query Examples
PromQL (Prometheus):
node_cpu_seconds_total{mode="idle"}
avg(rate(http_requests_total[5m]))
InfluxQL (InfluxDB):
SELECT mean("usage") FROM "cpu" WHERE time > now() - 1h GROUP BY time(1m)
16. Short Commands & Ports
| Command | Purpose |
|---|---|
| sudo systemctl start grafana-server | Start service |
| sudo systemctl stop grafana-server | Stop service |
| sudo systemctl status grafana-server | Check status |
| Default port | 3000 |