SKILLS

Technical Arsenal

From GPU cluster provisioning to LLM inference optimization — the full stack of modern AI engineering.

GPU_INFRASTRUCTURE & HPC
High-performance compute & hardware
NVIDIA DGX Systems, A100 / H100 GPUs, GPU-Accelerated Computing, Distributed Training, CUDA Workloads, NVIDIA Jetson, Raspberry Pi, OpenCV, OAK-D
AI_MODEL_INFERENCE & OPTIMIZATION
Serving, scaling & performance tuning
vLLM, Triton Inference Server, Inference Optimization, Quantization (INT4/INT8), Tensor Parallelism, Continuous Batching, Model Scaling, Ollama
KUBERNETES & MLOPS
Cluster management & deployment pipelines
Kubernetes (K8s), Kubeadm, Terraform, BCM Cluster Mgmt, Docker, Kubeflow, CI/CD Pipelines, CKAD Certified, SageMaker Model Registry
AI_FRAMEWORKS & LLMs
Foundation models, RAG & multi-agent systems
PyTorch, HuggingFace Transformers, LangChain, LlamaIndex, CrewAI, TensorFlow, OpenAI SDK, YOLO, OpenCV, Tesseract, Stable Diffusion
CLOUD & BACKEND
Infrastructure, APIs & data systems
AWS, GCP, Azure, FastAPI, Django, Python, SQL, PostgreSQL, MongoDB, Firebase, Redis, Vector DBs, Kafka
TOOLS & WORKFLOW
Dev toolchain & automation
Git / GitHub, Jupyter, VSCode, Roboflow, CVAT, Selenium, BeautifulSoup, Crawl4AI, Ollama
PROFICIENCY_OVERVIEW
LLM / GenAI: 92%
MLOps / K8s: 88%
GPU / HPC: 84%
Computer Vision: 80%
Cloud / AWS: 78%
Backend APIs: 75%