MLOPS - 2 Interview questions

Here are 100 questions and answers on model packaging, reproducibility, and deployment, tailored for a fresher's understanding.

Model Packaging & Reproducibility (Q1-Q40) πŸ“¦

  1. Q: What is the main goal of model packaging in MLOps?

    • A: To bundle a trained machine learning model along with all its dependencies and metadata into a single, portable, and runnable artifact.

  2. Q: Why is model packaging essential for deployment?

    • A: It ensures that the model can be deployed and run in a production environment consistently and reliably, without compatibility issues.

  3. Q: What is a "reproducible environment"?

    • A: An environment in which running the same code on the same data and configuration produces the exact same results every time.

  4. Q: Why is reproducibility difficult in ML projects?

    • A: It's challenging due to varying library versions, different operating systems, and changes to the data used for training.

  5. Q: How does packaging help with reproducibility?

    • A: By capturing all the necessary dependencies, it guarantees that the model will run the same way in any environment, whether it's for training or serving.

  6. Q: What is a "virtual environment"?

    • A: A virtual environment is an isolated directory that contains a specific Python interpreter and its installed packages, separate from the system's global packages.

  7. Q: How does a virtual environment contribute to reproducibility?

    • A: It ensures that the model and its dependencies are isolated, so changes to other projects don't affect it.

  8. Q: What is the purpose of a requirements.txt file?

    • A: It's a text file that lists all the required Python packages and their specific versions for a project.
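
For example, a minimal requirements.txt with pinned versions (the package versions here are illustrative):

```text
scikit-learn==1.3.0
pandas==2.0.3
numpy==1.24.4
```

Pinning exact versions (`==`) rather than loose ranges (`>=`) is what makes the environment reproducible.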

  9. Q: What are some limitations of virtual environments for production?

    • A: They can't manage non-Python dependencies and may still lead to inconsistencies across different operating systems.

  10. Q: What is the most popular tool for creating reproducible environments in MLOps?

    • A: Docker.

  11. Q: What is Docker?

    • A: Docker is a platform that uses OS-level virtualization to deliver software in packages called containers.

  12. Q: What is a "Docker image"?

    • A: A Docker image is a read-only template that contains all the necessary instructions and dependencies to create a Docker container.

  13. Q: What is a "Docker container"?

    • A: A Docker container is a live, runnable instance of a Docker image. It's an isolated process that runs the application.

  14. Q: What is the purpose of a Dockerfile?

    • A: A Dockerfile is a script containing instructions to build a Docker image. It specifies the base image, copies files, and installs dependencies.
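
A minimal illustrative Dockerfile for a model-serving app (the file names `requirements.txt` and `serve.py` are placeholders for your own project files):

```dockerfile
# Start from a slim official Python base image
FROM python:3.9-slim

WORKDIR /app

# Install dependencies first so this layer is cached between builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the model artifact and inference code
COPY . .

# Run the inference server when the container starts
CMD ["python", "serve.py"]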

  15. Q: What command do you use to build a Docker image?

    • A: docker build -t my-model-app .

  16. Q: What command do you use to run a Docker container?

    • A: docker run my-model-app

  17. Q: How does Docker solve the "it works on my machine" problem?

    • A: It packages the entire application and its environment into a single, portable container, ensuring it runs consistently everywhere.

  18. Q: What are the main components to include in a model package?

    • A: The trained model file itself, a script for inference, a requirements.txt file, and a Dockerfile.

  19. Q: How can you reduce the size of a Docker image for a model?

    • A: Use a smaller base image (e.g., python:3.9-slim; alpine is smaller still but can complicate Python builds because many wheels don't support its musl libc), combine commands to reduce layers, and use multi-stage builds.

  20. Q: What is the benefit of a "multi-stage build" in Docker?

    • A: It allows you to use a large image for building the application and a smaller, leaner image for the final production container, reducing its size.
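
A sketch of a multi-stage Dockerfile: dependencies are installed in a full-sized builder image, and only the installed packages are copied into the slim final image (file names are illustrative):

```dockerfile
# Stage 1: install dependencies in a full-featured image
FROM python:3.9 AS builder
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

# Stage 2: copy only the installed packages into a slim runtime image
FROM python:3.9-slim
COPY --from=builder /root/.local /root/.local
ENV PATH=/root/.local/bin:$PATH
WORKDIR /app
COPY . .
CMD ["python", "serve.py"]
```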

  21. Q: What is the purpose of a model.pkl file?

    • A: It's a binary file created using Python's pickle library to serialize a trained model, saving it to disk.
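
A minimal sketch of the save/load cycle. The `ThresholdModel` class is a stand-in for a real trained estimator (e.g., a fitted scikit-learn model), so the example stays self-contained:

```python
import os
import pickle
import tempfile

# Stand-in for a trained model; in practice this would be
# a fitted scikit-learn estimator or similar object.
class ThresholdModel:
    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, x):
        return 1 if x >= self.threshold else 0

model = ThresholdModel(threshold=0.5)

# Serialize the trained model to disk ...
path = os.path.join(tempfile.gettempdir(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# ... and load it back, e.g. inside a serving container.
with open(path, "rb") as f:
    loaded = pickle.load(f)

print(loaded.predict(0.7))  # -> 1, same behavior as the original model
```

For scikit-learn models specifically, `joblib` is often preferred over plain `pickle` because it handles large NumPy arrays more efficiently.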

  22. Q: What is conda? How does it differ from pip?

    • A: Conda is a package and environment manager that can handle packages for any language. Pip is a Python-specific package manager.

  23. Q: What is the role of a conda.yml file?

    • A: It's a YAML file that lists the dependencies for a conda environment, including both Python and non-Python packages.
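
An illustrative environment file (the environment name and package versions are examples):

```yaml
name: model-env
channels:
  - conda-forge
dependencies:
  - python=3.9
  - scikit-learn=1.3
  - pip
  - pip:
      - mlflow==2.5.0
```

Note that pip-only packages can be nested under a `pip:` entry, so one file covers both conda and pip dependencies.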

  24. Q: How does a conda environment help with reproducibility?

    • A: It can manage and reproduce an environment with specific versions of all packages, including system-level libraries.

  25. Q: What is the difference between an environment.yml and a requirements.txt?

    • A: An environment.yml is used by conda and can specify non-Python dependencies. requirements.txt is for pip and only handles Python packages.

  26. Q: What is the benefit of using MLflow Models for packaging?

    • A: MLflow Models provides a standard format for packaging models from any ML library, including metadata and environment details.

  27. Q: How does MLflow Models ensure reproducibility?

    • A: It saves the model along with all the environment information (e.g., conda.yml) and metadata, making it easy to deploy.

  28. Q: What is a "Python wheel"?

    • A: A Python wheel is a built-package format for Python that simplifies installation, often used for distributing libraries.

  29. Q: What are some other tools for environment management besides Docker and Conda?

    • A: Pipenv, Poetry, and virtualenv.

  30. Q: What is the role of a container registry (e.g., Docker Hub, AWS ECR)?

    • A: A container registry is a centralized repository for storing and managing Docker images.

  31. Q: How do you push a Docker image to a registry?

    • A: docker push my-registry/my-model-app:latest

  32. Q: What is the difference between a base image and a final image?

    • A: The base image is the starting point for your Dockerfile (e.g., python:3.9-slim). The final image is the complete image after all instructions have been executed.

  33. Q: What is a "layer" in a Docker image?

    • A: Each instruction in a Dockerfile creates a new, read-only layer. Docker caches these layers to speed up future builds.

  34. Q: Why is caching layers important in Docker?

    • A: It avoids rebuilding the entire image from scratch, which saves time, especially when only a small part of the Dockerfile has changed.

  35. Q: How do you version a Docker image?

    • A: By adding a tag to the image name, such as my-model-app:1.0 or my-model-app:latest.

  36. Q: What is the role of a docker-compose.yml file?

    • A: A docker-compose.yml file is used to define and manage multi-container Docker applications, for example, a model container and a database container.
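
A sketch of such a file, defining a model API container alongside a database container (service names and the password are illustrative placeholders):

```yaml
services:
  model-api:
    build: .           # build the model image from the local Dockerfile
    ports:
      - "8000:8000"    # expose the inference endpoint
    depends_on:
      - db
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example
```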

  37. Q: Why is it a good practice to use a non-root user in a Docker container?

    • A: It's a security best practice to prevent the application from having administrative privileges inside the container.

  38. Q: What is the main benefit of using a containerized environment for model training?

    • A: It ensures that the training environment is identical to the production environment, eliminating inconsistencies.

  39. Q: What is the purpose of pip freeze > requirements.txt?

    • A: This command generates a requirements.txt file that lists all the packages installed in the current environment with their exact versions.

  40. Q: What are the two main types of reproducibility you need to consider in MLOps?

    • A: Training reproducibility (getting the same model weights from the same code and data) and inference reproducibility (getting the same predictions from the same input on the same model).
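
A minimal sketch of training reproducibility using only the standard library's `random` module; the "training loop" is a stand-in for any computation that involves randomness:

```python
import random

def train(seed):
    # Fixing the random seed is one ingredient of training
    # reproducibility: the same code, data, and seed then
    # produce the same "learned" values on every run.
    random.seed(seed)
    weights = [random.random() for _ in range(3)]  # stand-in for training
    return weights

run1 = train(seed=42)
run2 = train(seed=42)
print(run1 == run2)  # -> True: identical weights from identical seeds
```

Real frameworks need the same treatment for every source of randomness (NumPy, PyTorch/TensorFlow, data shuffling, and sometimes GPU nondeterminism).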


Model Deployment & Serving (Q41-Q80) πŸš€

  1. Q: What is "model deployment"?

    • A: Model deployment is the process of making a trained machine learning model available to a business application or end-users.

  2. Q: What are the two main types of deployment strategies?

    • A: Online/Real-time deployment and Offline/Batch deployment.

  3. Q: What is "online inference"?

    • A: Online inference is when predictions are generated in real-time for individual requests with low latency requirements (e.g., a recommendation engine).

  4. Q: What is "offline inference"?

    • A: Offline inference is when predictions are generated for a large batch of data at once, typically on a schedule, without strict latency requirements.

  5. Q: What is the role of a REST API in model deployment?

    • A: A REST API provides a standardized way for an application to send data to the model and receive predictions as a response.

  6. Q: Name two common model serving frameworks.

    • A: Flask and FastAPI are common Python frameworks for creating REST APIs to serve models.

  7. Q: What is the benefit of using FastAPI for model serving?

    • A: FastAPI is a modern, high-performance web framework with built-in request validation (via Pydantic) that automatically generates interactive API documentation.

  8. Q: What is a "serverless" deployment?

    • A: A serverless deployment allows you to run your model's code without managing the underlying servers, scaling automatically based on traffic.

  9. Q: What are some advantages of a serverless approach?

    • A: It's highly scalable, cost-effective (you only pay for what you use), and reduces operational overhead.

  10. Q: What is the purpose of an "inference endpoint"?

    • A: An inference endpoint is a network address where a client can send a request to get a prediction from a deployed model.

  11. Q: What is "Kubernetes"?

    • A: Kubernetes is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications.

  12. Q: How does Kubernetes help with model deployment?

    • A: It helps manage a large number of model containers, handles scaling, load balancing, and ensures high availability.

  13. Q: What is a "rolling update" deployment?

    • A: A rolling update gradually replaces old model containers with new ones, ensuring continuous availability.

  14. Q: What is a "canary deployment"?

    • A: Canary deployment is a strategy where a new model version is deployed to a small subset of users (the "canary") to test its performance and stability before a full rollout.

  15. Q: What is a "blue-green deployment"?

    • A: A blue-green deployment involves running two identical environments: "blue" (the old version) and "green" (the new version). All traffic is switched from blue to green at once.

  16. Q: When would you use a canary deployment?

    • A: When you want to test a new model with real-world traffic to detect potential bugs or performance issues before a full release.

  17. Q: What are some challenges of real-time model serving?

    • A: Challenges include ensuring low latency, handling high traffic, and managing the cost of infrastructure.

  18. Q: What is the purpose of a "load balancer"?

    • A: A load balancer distributes incoming traffic across multiple instances of your model, preventing any single instance from becoming a bottleneck.
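
A toy round-robin dispatcher to illustrate the idea; real load balancers (nginx, a Kubernetes Service, a cloud load balancer) do this at the network level, and the instance names here are made up:

```python
from itertools import cycle

instances = ["model-a", "model-b", "model-c"]
next_instance = cycle(instances)

def route(request):
    # Each request goes to the next instance in turn, so no
    # single instance receives all of the traffic.
    return next(next_instance), request

assignments = [route(f"req-{i}")[0] for i in range(6)]
print(assignments)
# -> ['model-a', 'model-b', 'model-c', 'model-a', 'model-b', 'model-c']
```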

  19. Q: How does a "model registry" fit into the deployment process?

    • A: After a model is trained and versioned, a model registry is used to transition it to a "staging" or "production" stage, from where it can be deployed.

  20. Q: What is "CI/CD for ML" (CI/CD/CT)?

    • A: A system that automates the entire ML lifecycle, including continuous integration of code, continuous delivery/deployment of models, and continuous training (retraining) of models.

  21. Q: What is "A/B Testing" in deployment?

    • A: A/B Testing is a strategy where traffic is split between two different versions of a model to determine which one performs better on a specific metric.

  22. Q: What is the main difference between "serverless" and "containerized" deployment?

    • A: Serverless abstracts away the server management, while containerized deployment gives you more control over the infrastructure using tools like Docker and Kubernetes.

  23. Q: What is "model serving on the edge"?

    • A: Model serving on the edge involves deploying models to devices closer to the data source (e.g., mobile phones, IoT devices) to reduce latency and save bandwidth.

  24. Q: Why is model latency a critical metric for online serving?

    • A: Low latency is crucial for a good user experience in real-time applications like fraud detection or recommendation systems.

  25. Q: What are some key metrics to monitor for a deployed model?

    • A: Model performance (accuracy, F1-score), data drift, prediction latency, and resource usage (CPU/memory).

  26. Q: How can you scale a deployed model to handle more traffic?

    • A: By horizontally scaling, which means running more instances of the model behind a load balancer.

  27. Q: What is the purpose of "Model Monitoring"?

    • A: Model Monitoring involves tracking the performance and behavior of a model in production to detect issues like data drift or performance degradation.

  28. Q: How do you handle a "rollback" of a deployed model?

    • A: You can use your deployment pipeline to quickly revert to a previous, stable version of the model from the model registry.

  29. Q: What is the difference between "model serving" and "inference"?

    • A: Model serving is the act of providing the model to an external client. Inference is the process of getting a prediction from the model.

  30. Q: Why is it important to use a dedicated model serving framework (e.g., TensorFlow Serving) instead of a general-purpose web framework (e.g., Flask)?

    • A: Dedicated frameworks are optimized for inference, often supporting features like batching, concurrency, and model version management out of the box.

  31. Q: How does a REST API handle input and output data for a model?

    • A: It typically uses a structured format like JSON to send the input features and receive the prediction as a response.
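
A sketch of the JSON-in / JSON-out contract using only the standard library; in practice a framework like Flask or FastAPI handles the HTTP layer, and the averaging "model" is a stand-in for a real `model.predict` call:

```python
import json

def handle_request(body: str) -> str:
    payload = json.loads(body)             # parse the input features
    features = payload["features"]
    score = sum(features) / len(features)  # stand-in for model.predict
    return json.dumps({"prediction": score})

response = handle_request('{"features": [1.0, 2.0, 3.0]}')
print(response)  # -> {"prediction": 2.0}
```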

  32. Q: What is a "serverless function"?

    • A: A serverless function is a piece of code that runs in response to an event, which can be used to serve a model for a single prediction request.

  33. Q: What is the purpose of a "health check endpoint"?

    • A: A health check endpoint is an API endpoint that reports the status of the deployed model, indicating if it's ready to serve requests.

  34. Q: What is the role of logging in a deployed model?

    • A: Logging is used to record events and data points, which is crucial for monitoring, debugging, and auditing the model in production.

  35. Q: How does a "model gateway" work?

    • A: A model gateway is an API gateway that sits in front of your models, providing a single entry point for all clients and handling cross-cutting concerns such as routing and authentication.

  36. Q: What is "batch processing" and why is it used for some ML tasks?

    • A: Batch processing is for processing large amounts of data at once. It's used when there's no need for real-time predictions, and it's more cost-effective.

  37. Q: How does "GitOps" apply to MLOps deployment?

    • A: GitOps uses Git as the single source of truth for the desired state of the deployment environment, automating changes through commits and pull requests.

  38. Q: What is the purpose of a "payload" in an API request to a model?

    • A: The payload is the body of the request, which contains the input data for the model to generate a prediction.

  39. Q: How does "autoscaling" work in model deployment?

    • A: Autoscaling automatically adjusts the number of model instances based on predefined metrics like CPU usage or incoming request volume.

  40. Q: What is the difference between "horizontal" and "vertical" scaling?

    • A: Horizontal scaling adds more machines. Vertical scaling adds more resources (CPU, RAM) to a single machine.


Advanced Concepts & Best Practices (Q81-Q100) πŸ“ˆ

  1. Q: What is "Model Drift Detection"?

    • A: The process of automatically identifying when a model's performance has degraded due to changes in the data or the relationship between variables.

  2. Q: What is "Data Drift"?

    • A: A change in the statistical properties of the input data used by the model.

  3. Q: What is "Concept Drift"?

    • A: A change in the relationship between the input features and the target variable.

  4. Q: How can you monitor for data drift in production?

    • A: By comparing the statistical properties of the production data with the training data.
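
A deliberately simple sketch of such a comparison, flagging drift when a feature's production mean moves too far from its training mean; real systems apply per-feature statistical tests (KS test, Population Stability Index) instead of this toy rule:

```python
import statistics

def drifted(train_values, prod_values, threshold=0.5):
    train_mean = statistics.mean(train_values)
    prod_mean = statistics.mean(prod_values)
    train_std = statistics.stdev(train_values)
    # Flag drift when the means differ by more than
    # `threshold` training standard deviations.
    return abs(prod_mean - train_mean) > threshold * train_std

training_data = [1.0, 2.0, 3.0, 4.0, 5.0]          # mean 3.0, stdev ~1.58
print(drifted(training_data, [3.1, 2.9, 3.0]))      # -> False: no shift
print(drifted(training_data, [7.0, 8.0, 9.0]))      # -> True: clear shift
```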

  5. Q: What is the purpose of a "Feature Store" in a deployed system?

    • A: A Feature Store ensures that the features used for online inference are calculated and served in the same way they were for offline training.

  6. Q: Why is "observability" important in MLOps?

    • A: Observability provides a holistic view of the system's health, allowing you to understand its internal state and debug issues proactively.

  7. Q: What is the role of a "Data Validation" step in an MLOps pipeline?

    • A: It ensures the integrity of the data, checking for schema and statistical anomalies before a model is trained or used for inference.

  8. Q: How does "continuous training" work?

    • A: Continuous training is a process that automatically retrains a model based on new data or a schedule, and then pushes the new version to the registry.

  9. Q: What are the main components of a "real-time inference stack"?

    • A: A web server, a load balancer, a container orchestration tool (Kubernetes), and a monitoring system.

  10. Q: What is a "serverless model endpoint"?

    • A: A serverless model endpoint is an API endpoint for a model that scales automatically and only charges you for the time it is actively processing requests.

  11. Q: What is the benefit of "microservices architecture" for model deployment?

    • A: It allows you to deploy and scale different parts of your application independently, which can be useful for complex ML systems.

  12. Q: How do you ensure model security during deployment?

    • A: By encrypting data, using secure authentication, and running models in isolated environments (e.g., containers).

  13. Q: What is the purpose of a "prediction log"?

    • A: A prediction log records every request and response, which is crucial for debugging, auditing, and retraining a model.

  14. Q: How does a "feature store" help with online inference latency?

    • A: By pre-computing and serving features at low latency, it reduces the time it takes to get data for a prediction.

  15. Q: What is the difference between "model staging" and "model serving"?

    • A: Staging is the pre-production phase where a model version is validated and approved. Serving is the act of making it available for predictions.

  16. Q: What is the main advantage of using a dedicated "ML Platform" (e.g., SageMaker)?

    • A: It provides a comprehensive, integrated environment for the entire ML lifecycle, reducing the need to stitch together multiple tools.

  17. Q: What is "continuous monitoring"?

    • A: The practice of continuously tracking and analyzing the performance of a deployed model.

  18. Q: Why is it important to test your model at scale before deployment?

    • A: To ensure that the model and its infrastructure can handle the expected production traffic without performance degradation.

  19. Q: How does "Continuous Delivery" apply to MLOps?

    • A: It means that every change to the model or its code is automatically built, tested, and ready to be deployed to a production-like environment.

  20. Q: How does MLOps help with cost optimization?

    • A: By automating the entire process and using scalable infrastructure, MLOps reduces manual effort and optimizes resource usage.
