AI & Machine Learning Engineer

Anirudh Raj Sharma.

Engineering production ML systems that ship. From 52GB data pipelines to real-time computer vision.

About me.

Graduate student pursuing a Master of Science in Artificial Intelligence at the University at Buffalo, SUNY. Deep focus on Reinforcement Learning, Machine Learning, and Computer Vision.

I build production-grade ML pipelines, optimize models for real-world performance, and deploy scalable AI systems. My work ranges from architecting ETL pipelines that process 52GB+ datasets at IBM to engineering real-time network monitoring at InfraKnit Technologies.

MS in Artificial Intelligence (University at Buffalo)

2 industry internships (IBM & InfraKnit)

4+ ML projects shipped (production-grade systems)

Experience.

IBM

Software Engineer Intern

Jun 2022 – Aug 2022

  • Architected an end-to-end ETL pipeline using Python and Dask to process a 52GB dataset, accelerating the ML development lifecycle by 35%.
  • Drove an 18% uplift in model accuracy by implementing and tuning an XGBoost classifier, replacing an underperforming Random Forest baseline.
  • Containerized the application with Docker and configured deployment within a Jenkins CI/CD pipeline for zero-downtime releases.

InfraKnit Technologies

Network Engineering Intern

Aug 2023 – Oct 2023

  • Re-architected network protocols for public transit systems, improving critical alert delivery time by 20%.
  • Implemented preventive maintenance schedules that slashed critical network failures by 40% across 15+ nodes.
  • Created troubleshooting guides standardizing resolutions for 10+ common issues, cutting ticket resolution time by 30%.

Projects.

Data Engineering

CVE Security Data Lakehouse

Production-grade pipeline on Databricks processing 40K+ cybersecurity vulnerabilities. Bronze-Silver-Gold Medallion architecture with Delta Lake ACID transactions.

318K raw files

4,000+ vendors

1,816 critical CVEs

PySpark, Delta Lake, Databricks, SQL
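The Bronze-Silver-Gold flow described above can be illustrated with a minimal sketch. It uses plain Python in place of PySpark and Delta Lake so it runs anywhere, and it is not the project's actual code. The field names (cve_id, vendor, cvss), the helper names to_silver and to_gold, and the 9.0 critical threshold (the CVSS v3 "Critical" band) are assumptions chosen for illustration.

```python
from collections import Counter


def to_silver(bronze):
    """Silver layer: drop malformed rows, normalize types, de-duplicate by CVE id."""
    cleaned = {}
    for rec in bronze:
        cve_id = (rec.get("cve_id") or "").strip().upper()
        try:
            score = float(rec.get("cvss"))
        except (TypeError, ValueError):
            continue  # unparseable score: the row stays in bronze only
        if not cve_id:
            continue
        vendor = (rec.get("vendor") or "unknown").strip().lower()
        # Later records overwrite earlier ones with the same id.
        cleaned[cve_id] = {"cve_id": cve_id, "vendor": vendor, "cvss": score}
    return list(cleaned.values())


def to_gold(silver, critical_threshold=9.0):
    """Gold layer: business-level aggregate, here critical CVEs per vendor."""
    critical = (r["vendor"] for r in silver if r["cvss"] >= critical_threshold)
    return dict(Counter(critical))


# Bronze layer: raw records kept exactly as ingested (hypothetical sample data).
bronze = [
    {"cve_id": "cve-2024-0001", "vendor": "Acme ", "cvss": "9.8"},
    {"cve_id": "CVE-2024-0001", "vendor": "acme", "cvss": "9.8"},  # duplicate
    {"cve_id": "CVE-2024-0002", "vendor": "acme", "cvss": "n/a"},  # malformed
    {"cve_id": "CVE-2024-0003", "vendor": "Globex", "cvss": 5.0},
]

silver = to_silver(bronze)
gold = to_gold(silver)
```

In a Databricks setting each layer would typically be a Delta table rather than an in-memory list, so every stage can be rebuilt from the one before it.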

AI Agent

SQL-Grounded Analytics Agent

Production agent answering business questions via safe SQL generation. Schema-aware with dry-run validation, query budget caps, and PII-aware row limits.

48% fewer errors

280ms p90 latency

100% validated

Python, LangChain, OpenAI API, SQLAlchemy, Pinecone
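One way dry-run validation and row limits might fit together is sketched below. It uses the standard-library sqlite3 module rather than the SQLAlchemy stack listed above, purely so it is self-contained. The function name guard_query and the default cap of 100 rows are hypothetical, and the checks are deliberately conservative (for example, a semicolon inside a string literal is rejected too).

```python
import re
import sqlite3

MAX_ROWS = 100  # hypothetical default row cap


def guard_query(conn, sql, max_rows=MAX_ROWS):
    """Return a row-capped version of `sql`, or raise ValueError if it is unsafe."""
    stmt = sql.strip().rstrip(";").strip()
    # Allow exactly one read-only statement.
    if ";" in stmt:
        raise ValueError("multiple statements are not allowed")
    if not re.match(r"(?i)^select\b", stmt):
        raise ValueError("only SELECT queries are allowed")
    # Dry run: EXPLAIN compiles the statement against the live schema
    # (catching unknown tables and columns) without running the query itself.
    try:
        conn.execute(f"EXPLAIN {stmt}")
    except sqlite3.Error as exc:
        raise ValueError(f"dry run failed: {exc}") from exc
    # Enforce the row cap by wrapping the query instead of editing it.
    return f"SELECT * FROM ({stmt}) LIMIT {int(max_rows)}"


# Hypothetical in-memory table used to exercise the guard.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(i, i * 1.5) for i in range(10)]
)

safe_sql = guard_query(conn, "SELECT id, total FROM orders;", max_rows=3)
rows = conn.execute(safe_sql).fetchall()
```

Wrapping the generated query in an outer SELECT keeps the cap in force even when the model already wrote its own, larger LIMIT.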

Computer Vision

Real-Time Product Detection

Fine-tuned YOLOv8 to identify 50+ product SKUs from retail shelf images. Deployed as REST API via Flask/Docker on GCP for real-time inventory tracking.

0.92 mAP score

50+ SKUs

400% augmentation

PyTorch, YOLOv8, Flask, Docker, GCP
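For context on the mAP figure: mean average precision is built on intersection-over-union (IoU), which decides whether a predicted box counts as a match for a ground-truth box. The sketch below shows only that building block, not the project's evaluation code. The (x1, y1, x2, y2) box format and the 0.5 match threshold are assumptions, since the IoU threshold behind the reported score is not stated here.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, truth_box, threshold=0.5):
    """A prediction counts as a true positive when IoU meets the threshold."""
    return iou(pred_box, truth_box) >= threshold


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1).
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```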

Social Network Analysis

Echo Chamber Analysis

Analyzed echo chamber formation on Reddit using network analysis. Led the MCP implementation enabling AI-powered analysis of community polarization.

MCP integration

NLP analysis

EAS 587 research

Python, MCP, Network Analysis, Reddit API

Skills.

Languages & Frameworks

Python, C/C++, JavaScript, Django, Node.js

AI & Machine Learning

PyTorch, TensorFlow, LangChain, Hugging Face, LLMs, XGBoost, scikit-learn, Vector Databases

Data Engineering & Cloud

Apache Spark, Dask, ETL/ELT, dbt, Snowflake, BigQuery, AWS, GCP, Azure, PostgreSQL, MySQL

MLOps & DevOps

Kubernetes, Docker, Jenkins, Terraform, MLflow, W&B, Prometheus, Airflow, Kafka, Git, FastAPI

Let's connect.

Open to new opportunities, collaborations, and interesting conversations about AI and ML.