I specialize in end-to-end Computer Vision, Machine Learning, and AI model development. I handle the full pipeline for you, from dataset collection and annotation through model training to real-world deployment.
What I Offer
- Object Detection (YOLO11, Detectron2, DINOv3, RF-DETR)
- Object Tracking (ByteTrack, DeepSORT)
- Image Segmentation (Mask R-CNN, U-Net, SAM 2, YOLO-Seg)
- Pose Estimation (ViTPose, YOLO-Pose)
- Depth Estimation, Feature Extraction (keypoint detection)
- Face Recognition (FaceNet, DeepFace, Dlib)
- OCR Solutions for images, PDFs, and scanned docs (Tesseract, PaddleOCR, Azure Document Intelligence, AWS Textract)
- GANs, Image Generation
- Data collection and annotation
- Image Captioning (Florence-2, CLIP)
- Real-time video processing, live video analysis
- ONNX & TensorRT conversion for fast inference
- Deployment on AWS, GCP, Android, iOS, Raspberry Pi, Edge Devices
Tools, platforms and frameworks:
- PyTorch, TensorFlow, Keras, scikit-learn, Hugging Face
- AWS, GCP, Azure
- Docker containerization
- Vector DBs: Chroma, Pinecone
- Jupyter Notebook, Colab
- Databases: MySQL, PostgreSQL
Deliverables include trained models, clean datasets, documentation, and setup guides.