Custom Model Training & Deployment on GCP

🚀 Custom Model Training & Deployment in GCP

Custom model training and deployment is a powerful capability of Google Cloud Platform (GCP) that lets you run your own ML code on scalable infrastructure. GCP's Vertex AI service provides training jobs, a Model Registry, deployment endpoints, and autoscaling. In this post we walk step by step through training and deploying a custom ML model on GCP.

🔹 Custom Training vs Pre-built Training

Pre-built Training: Predefined algorithms and AutoML tools are available.
Custom Training: For when you need to run your own ML code (TensorFlow, PyTorch, scikit-learn, XGBoost).
Custom training gives you the most flexibility, especially when your use case is unique.

🔹 Requirements for Custom Model Training

A Google Cloud project with billing enabled
A dataset available in GCS (Google Cloud Storage) or BigQuery
Training code (Python with TensorFlow, PyTorch, etc.)
A container (Docker) image (optional, if you need a custom environment)
IAM permissions (Vertex AI User, Storage Admin)

🔹 Step 1: Prepare the Dataset

# Example: load a CSV from a GCS bucket
# (pandas needs the gcsfs package installed to read gs:// paths)
import pandas as pd

df = pd.read_csv("gs://my-bucket/datasets/train.csv")
print(df.head())
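
Before uploading a CSV to GCS, it can help to confirm that the label column exists and that no row has empty fields. The following is a minimal sketch using only the Python standard library. The function name `check_csv` and the sample rows are illustrative assumptions; the `target` column name is the one the training script in Step 2 expects.

```python
import csv
import io


def check_csv(text, target="target"):
    """Return (row_count, problems) for CSV text that starts with a header row."""
    reader = csv.DictReader(io.StringIO(text))
    problems = []
    if target not in (reader.fieldnames or []):
        problems.append(f"missing column: {target}")
    row_count = 0
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        row_count += 1
        # DictReader fills short rows with None; empty cells arrive as "".
        has_gap = any(
            value is None or (isinstance(value, str) and not value.strip())
            for value in row.values()
        )
        if has_gap:
            problems.append(f"line {line_no}: empty field")
    return row_count, problems


# Hypothetical sample: the second data row is missing its first value.
sample = "amount,country,target\n12.5,IN,0\n,US,1\n"
print(check_csv(sample))
```

An empty `problems` list suggests the file is at least structurally ready for training; it does not check value ranges or types.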

🔹 Step 2: Write the Training Script

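# train.py
import joblib
import pandas as pd
from google.cloud import storage
from sklearn.ensemble import RandomForestClassifier

# Load the training data (reading gs:// paths requires gcsfs)
df = pd.read_csv("gs://my-bucket/datasets/train.csv")
X = df.drop("target", axis=1)
y = df["target"]

# Train the model; a fixed seed keeps runs reproducible
model = RandomForestClassifier(random_state=42)
model.fit(X, y)

# Save the model locally, then copy it to GCS so it outlives the training container
joblib.dump(model, "model.joblib")
storage.Client().bucket("my-bucket").blob("trained-model/model.joblib").upload_from_filename("model.joblib")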
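
A file written to the training container's local disk is discarded when the job ends, so the script needs a GCS destination for the model. Vertex AI sets the `AIP_MODEL_DIR` environment variable to a `gs://` path when the job has a base output directory configured. The sketch below reads that variable and splits the URI into a bucket and object name. The helper names `resolve_model_dir` and `split_gcs_uri` are hypothetical, and the fallback path is assumed to match the artifact location used in Step 4.

```python
import os


def resolve_model_dir(default="gs://my-bucket/trained-model/"):
    """Prefer the directory Vertex AI provides, else fall back to a default."""
    return os.environ.get("AIP_MODEL_DIR") or default


def split_gcs_uri(uri, filename="model.joblib"):
    """Split gs://bucket/prefix/ into (bucket, object name for filename)."""
    if not uri.startswith("gs://"):
        raise ValueError(f"not a GCS URI: {uri}")
    bucket, _, prefix = uri[len("gs://"):].partition("/")
    prefix = prefix.strip("/")
    blob_name = f"{prefix}/{filename}" if prefix else filename
    return bucket, blob_name


bucket, blob_name = split_gcs_uri(resolve_model_dir())
print(bucket, blob_name)
```

The resulting pair can be handed to the google-cloud-storage client, for example `storage.Client().bucket(bucket).blob(blob_name).upload_from_filename("model.joblib")`.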

🔹 Step 3: Run the Training Job on Vertex AI

You can submit the training job with either the gcloud CLI or the Python SDK.

# Using gcloud CLI
gcloud ai custom-jobs create \
  --region=us-central1 \
  --display-name=custom-train-job \
  --worker-pool-spec=machine-type=n1-standard-4,replica-count=1,container-image-uri=gcr.io/my-project/train-image
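
The same job can be submitted from the Python SDK. This is a rough sketch, assuming the `google-cloud-aiplatform` package is installed and that a staging bucket is available. The function names and the `project` and `staging_bucket` arguments are placeholders, not values from this guide.

```python
def build_worker_pool_specs(image_uri, machine_type="n1-standard-4", replica_count=1):
    """Mirror the --worker-pool-spec flag as the structure the SDK expects."""
    return [
        {
            "machine_spec": {"machine_type": machine_type},
            "replica_count": replica_count,
            "container_spec": {"image_uri": image_uri},
        }
    ]


def submit_custom_job(project, staging_bucket, image_uri, region="us-central1"):
    """Submit the job and wait for it to finish (needs google-cloud-aiplatform)."""
    # Imported lazily so the helper above stays dependency-free.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=region, staging_bucket=staging_bucket)
    job = aiplatform.CustomJob(
        display_name="custom-train-job",
        worker_pool_specs=build_worker_pool_specs(image_uri),
    )
    job.run()  # blocks until the job completes
    return job


specs = build_worker_pool_specs("gcr.io/my-project/train-image")
print(specs[0]["machine_spec"]["machine_type"])
```

Keeping the spec in a plain function makes it easy to inspect or unit-test before anything is sent to GCP.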

🔹 Step 4: Save the Model to the Vertex AI Model Registry

# Model upload using gcloud
gcloud ai models upload \
  --region=us-central1 \
  --display-name=custom-ml-model \
  --artifact-uri=gs://my-bucket/trained-model/ \
  --container-image-uri=gcr.io/cloud-aiplatform/prediction/sklearn-cpu.1-0
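
For reference, a Python SDK version of the upload might look like the sketch below. It assumes `google-cloud-aiplatform` is installed. The helper names are hypothetical, and the serving image URI is copied unchanged from the gcloud command above, so verify it against the current list of prebuilt prediction containers before relying on it.

```python
def model_upload_args(
    artifact_uri,
    display_name="custom-ml-model",
    serving_image="gcr.io/cloud-aiplatform/prediction/sklearn-cpu.1-0",
):
    """Build keyword arguments for aiplatform.Model.upload."""
    if not artifact_uri.startswith("gs://"):
        raise ValueError("artifact_uri must be a gs:// directory")
    return {
        "display_name": display_name,
        "artifact_uri": artifact_uri.rstrip("/"),
        "serving_container_image_uri": serving_image,
    }


def upload_model(project, artifact_uri, region="us-central1"):
    """Register the model and return its resource name."""
    from google.cloud import aiplatform  # lazy import; only needed for the real call

    aiplatform.init(project=project, location=region)
    model = aiplatform.Model.upload(**model_upload_args(artifact_uri))
    return model.resource_name


print(model_upload_args("gs://my-bucket/trained-model/"))
```

The artifact directory is expected to contain the `model.joblib` file produced by the training script.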

🔹 Step 5: Model Deployment (Online Prediction)

# Create the endpoint
gcloud ai endpoints create --region=us-central1 --display-name=my-endpoint

# Deploy the model (deploy-model also requires a display name for the deployment)
gcloud ai endpoints deploy-model ENDPOINT_ID \
  --region=us-central1 \
  --model=MODEL_ID \
  --display-name=custom-ml-model-deployment \
  --machine-type=n1-standard-4
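
The two commands above can also be approximated with the Python SDK. The sketch below assumes `google-cloud-aiplatform` is installed; `create_and_deploy` and `resource_id` are hypothetical helper names. The SDK returns full resource names, so the small helper extracts the trailing ID that the gcloud commands refer to as ENDPOINT_ID.

```python
def resource_id(resource_name):
    """Return the trailing ID from a name like projects/P/locations/R/endpoints/ID."""
    return resource_name.rstrip("/").rsplit("/", 1)[-1]


def create_and_deploy(project, model_id, region="us-central1"):
    """Create an endpoint, deploy an already registered model, return the endpoint ID."""
    from google.cloud import aiplatform  # lazy import; only needed for the real call

    aiplatform.init(project=project, location=region)
    endpoint = aiplatform.Endpoint.create(display_name="my-endpoint")
    model = aiplatform.Model(model_name=model_id)
    model.deploy(endpoint=endpoint, machine_type="n1-standard-4")
    return resource_id(endpoint.resource_name)


# Illustrative resource name, not a real endpoint.
print(resource_id("projects/123/locations/us-central1/endpoints/456"))
```

Deployment can take several minutes, and the deployed replicas are billed until the model is undeployed.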

⚡ Scaling Options

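Auto-scaling: The endpoint adds or removes replicas based on traffic, within the minimum and maximum replica counts you configure.
Multi-model deployment: Multiple models can be deployed to a single endpoint.
A/B Testing: Different model versions can be served side by side by splitting traffic between them.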
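
As a rough illustration of these options, the sketch below deploys a second model to an existing endpoint with autoscaling bounds and a 10% canary traffic split. It assumes `endpoint` and `model` are `aiplatform.Endpoint` and `aiplatform.Model` objects. The helper names, replica counts, and percentages are illustrative assumptions. In the SDK's `traffic_split` mapping, the key `"0"` stands for the model being deployed in that call.

```python
def canary_split(current_deployed_model_id, new_model_percent):
    """Build a traffic split where the key "0" is the model being deployed now."""
    if not 0 < new_model_percent < 100:
        raise ValueError("new_model_percent must be between 1 and 99")
    return {
        "0": new_model_percent,
        current_deployed_model_id: 100 - new_model_percent,
    }


def deploy_canary(endpoint, model, current_deployed_model_id):
    """Send 10% of traffic to the new model; autoscale between 1 and 3 replicas."""
    endpoint.deploy(
        model=model,
        machine_type="n1-standard-4",
        min_replica_count=1,
        max_replica_count=3,
        traffic_split=canary_split(current_deployed_model_id, 10),
    )


# "1234567890" is a made-up deployed model ID.
print(canary_split("1234567890", 10))
```

The percentages in a traffic split must add up to 100, which is why the helper derives the existing model's share from the new model's share.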

🔒 Security & IAM Best Practices

Assign only the minimum IAM roles needed (for example Vertex AI User and Storage Object Viewer)
Use service accounts rather than user credentials
Store model artifacts in GCS with bucket-level IAM and encryption
Place endpoints inside a VPC Service Controls perimeter
Enable Cloud Logging and Cloud Monitoring for logging and monitoring
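
To make the service-account recommendation concrete, here is a small sketch that refuses to start a training job unless it is given a service account address. It assumes `job` is an `aiplatform.CustomJob`; the helper names and the `trainer@my-project...` address are hypothetical.

```python
def require_service_account(email):
    """Reject anything that does not look like a user-managed service account."""
    local, _, domain = email.partition("@")
    if not local or not domain.endswith(".iam.gserviceaccount.com"):
        raise ValueError(f"not a service account address: {email}")
    return email


def run_job_as(job, email):
    """Run a CustomJob under a dedicated service account instead of the default identity."""
    job.run(service_account=require_service_account(email))


print(require_service_account("trainer@my-project.iam.gserviceaccount.com"))
```

The check is only a guard against accidentally passing a personal account; the roles granted to the service account still need to be reviewed separately.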

⚡ Real-Life Example: Fraud Detection

Suppose you are building a fraud detection ML model:
- The data is stored in GCS.
- You run a custom training job to train a RandomForest or deep learning model.
- The model is saved to the Vertex AI Model Registry.
- You deploy it to an endpoint and predict transaction fraud in real time.
- Cloud Monitoring tracks endpoint latency, and accuracy can be tracked once confirmed fraud labels are logged.
In this way GCP gives you a production-ready ML pipeline.
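
The real-time prediction step of this example could be sketched as follows. It assumes `google-cloud-aiplatform` is installed and that the deployed scikit-learn model returns class labels, with 1 meaning fraud. The feature names (`amount`, `hour`), the transaction IDs, and the helper names are all made up for illustration.

```python
def to_instances(transactions, feature_order):
    """Convert transaction dicts into row lists in the training column order."""
    return [[tx[name] for name in feature_order] for tx in transactions]


def flagged_ids(transaction_ids, predictions, fraud_label=1):
    """Return the IDs whose predicted class equals the fraud label."""
    return [tid for tid, pred in zip(transaction_ids, predictions) if pred == fraud_label]


def score(endpoint_name, transactions, feature_order):
    """Call the deployed endpoint; endpoint_name is its full resource name."""
    from google.cloud import aiplatform  # lazy import; only needed for the real call

    endpoint = aiplatform.Endpoint(endpoint_name)
    response = endpoint.predict(instances=to_instances(transactions, feature_order))
    return response.predictions


# Offline illustration with made-up predictions instead of a live endpoint.
rows = [{"amount": 120.0, "hour": 3}, {"amount": 9800.0, "hour": 2}]
print(to_instances(rows, ["amount", "hour"]))
print(flagged_ids(["t1", "t2"], [0, 1]))
```

Sending features in the same column order used during training matters, because the model receives plain lists without column names.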

🏆 Conclusion

Vertex AI custom model training and deployment gives ML developers both flexibility and scalability. Whether your model is written in TensorFlow, PyTorch, or scikit-learn, you can train and deploy it on GCP by following the steps above. If you need enterprise-grade ML infrastructure, Vertex AI is a strong option.