Hemang Manish Shah

Software Development Engineer II at Amazon.com

https://www.linkedin.com/in/hshah20/

FELLOW MEMBER

Hemang Manish Shah’s career reflects a professional pattern seen in a select class of engineers: those whose work not only solves difficult technical problems, but also expands what organizations believe is technically possible. As a Senior Software Engineer at Zoom Video Communications’ AI Companion team, following impactful experience as a Senior Software Development Engineer at Amazon, Shah has consistently operated at the intersection of machine learning, large-scale distributed systems, cloud infrastructure, and intelligent automation. Across both workplace platforms and global commerce ecosystems, his work has focused on building systems that connect enterprise knowledge, detect abuse, protect digital trust, and scale applied AI into production with measurable impact.

His technical foundation is unusually broad and deeply integrated. Shah’s work draws from TensorFlow, PyTorch, large language models, natural language processing, computer vision, AWS Inferentia optimization, SageMaker, ECS Fargate, Lambda, Step Functions, DynamoDB, OpenSearch, Temporal workflows, Go, Python, Java, and React. This is not a collection of isolated tools, but a coherent engineering stack deployed in service of difficult, high-throughput, real-world systems. His professional trajectory shows repeated success in combining these capabilities into architectures that improve scale, efficiency, and intelligence across enterprise platforms.

At Zoom, Shah demonstrated this capability by re-architecting a large-scale enterprise data connector pipeline that synchronizes knowledge into retrieval systems. The original system supported approximately 20,000 documents per day. Recognizing that the market standard operated at several times that level, he redesigned the architecture to parallelize processing across independent data sources, move filtering earlier in the workflow, and introduce fire-and-forget event handling so downstream services could scale elastically against queue backlogs. Within workflow activities, he applied goroutine-based parallelization, reducing latency from roughly 40–50 seconds to about 4 seconds. Delivered within a two-week development window, the redesigned system achieved throughput of approximately 700,000 documents per day, about 35 times the original system’s capacity and an 8–9x improvement over existing market benchmarks, substantially advancing the performance of enterprise retrieval infrastructure.
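The latency effect of running independent steps concurrently inside one activity can be illustrated with a minimal sketch. This is not the production code: the original work used goroutines in Go, whereas this example substitutes a Python thread pool as an analogous construct. The function names, the simulated 50 ms delay, and the worker count are all hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def process_document(doc_id: str, delay: float = 0.05) -> str:
    """Hypothetical stand-in for an I/O-bound step such as fetching a document."""
    time.sleep(delay)  # simulated network or storage wait
    return f"processed:{doc_id}"


def process_sequential(doc_ids):
    """Handle documents one after another; total time grows with the count."""
    return [process_document(d) for d in doc_ids]


def process_concurrent(doc_ids, max_workers=10):
    """Handle documents concurrently; waits overlap instead of adding up."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order in its results
        return list(pool.map(process_document, doc_ids))
```

With ten documents and ten workers, the sequential version waits for each simulated delay in turn, while the concurrent version overlaps them, so the results are identical but the elapsed time is much shorter.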

At Amazon, Shah’s most comprehensive demonstration of technical leadership came through PRISM, a continuous intellectual property infringement discovery system built on machine learning-based semantic search. As technical lead for a cross-functional team of three engineers, he architected the system from concept through production. After extensive experimentation, he selected and fine-tuned a CLIP-based model, achieving 2x higher recall and 3x higher throughput than competing candidates. He optimized the model for AWS Inferentia2, reaching 3,500 transactions per second while reducing costs by 33%. He also engineered a discovery pipeline capable of ingesting risk signals and performing KNN searches with p90 latency of 200 milliseconds across a 27-billion-vector OpenSearch index. Through inverted file KNN and product quantization, he reduced index memory from 7,032 GB to 586 GB, a 12x reduction. The resulting platform enabled the removal of roughly 22,000 counterfeit products per day, saved approximately 1,000 person-hours daily, and produced projected annual savings of $6.75 million.

His work also extended into optimizing human audit workflows for intellectual property enforcement. In that environment, associates had previously been required to review every listing manually. Shah integrated the BLIP2 multimodal vision-language model to automate filtering, reducing unnecessary audits by 80% and increasing human audit yield fourfold. He then led a transition to InstructBLIP, redesigning the inference architecture with distributed sharding and CPU offloading, and enabling batch processing of 800 images per request, an 80x throughput increase. These changes increased daily processing capacity from 20,000 products to more than 1.6 million products, illustrating his ability to combine cutting-edge model selection with production-grade systems engineering.

Another major dimension of Shah’s work has been image recognition and brand protection at scale. In the YOLOv4-based Brand Logo Detection System, he helped create an object detection and embedding pipeline that processed millions of product images, localized potential brand logos, and matched them against an internal repository through k-nearest-neighbor search. This system enabled the automated removal of approximately 650,000 unauthorized product listings per month. In a related Image Similarity Search platform, he used a fine-tuned MoCoV2 ResNet model to generate embeddings across a search space of 8 billion product images, with nmslib HNSW enabling sub-second search performance. That platform empowered brands to identify unauthorized image use and contributed to the removal of approximately 28,000 products per day involving illegal copyright use.

The foundation of Shah’s Amazon work was established even earlier through a high-performance NLP-based Rule Execution System for IP infringement detection. This system processed 17,500 transactions per second across 1.5 billion daily product updates, evaluating each against more than 350,000 complex rules in real time. He incorporated custom Lucene-based indexing with language-specific analyzers to counter brand and trademark obfuscation, and developed a query generation approach that reduced the number of rules requiring full evaluation. The resulting rule engine enabled the removal of approximately 250,000 illegal products per day, with processing completed in under 250 milliseconds per product update. He further extended this work through a Keyword Query Automation System that saved 1,575 person-hours per month and identified approximately 15,000 infringing products annually.
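The idea of reducing how many rules need full evaluation can be sketched with a small inverted index. This is an illustrative assumption, not the Lucene-based system described above: the `RuleIndex` class, the whitespace tokenizer, and the all-tokens-must-match rule format are hypothetical simplifications.

```python
from collections import defaultdict


def tokenize(text):
    """Naive lowercase whitespace tokenizer (a real analyzer would do far more)."""
    return set(text.lower().split())


class RuleIndex:
    """Hypothetical rule store: each rule is a set of tokens that must all appear."""

    def __init__(self, rules):
        self.rules = {rid: {t.lower() for t in toks} for rid, toks in rules.items()}
        self.by_token = defaultdict(set)
        for rid, toks in self.rules.items():
            if not toks:
                raise ValueError(f"rule {rid!r} has no tokens")
            # Index each rule under a single anchor token. A rule can match
            # only if its anchor is present, so absent anchors prune the rule.
            self.by_token[min(toks)].add(rid)

    def candidates(self, text):
        """Rules whose anchor token occurs in the text (cheap pre-filter)."""
        found = set()
        for token in tokenize(text):
            found |= self.by_token.get(token, set())
        return found

    def matches(self, text):
        """Fully evaluate only the candidate rules, not the whole rule set."""
        tokens = tokenize(text)
        return sorted(r for r in self.candidates(text) if self.rules[r] <= tokens)
```

For a product update whose text contains none of a rule's anchor tokens, that rule is never evaluated, which is the property that keeps per-update work small even when the rule set is large.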

What distinguishes Shah’s profile is not only the technical sophistication of these systems, but the repeated pattern of converting advanced machine learning and distributed systems concepts into operational platforms with substantial human, economic, and organizational benefit. His work spans enterprise retrieval, semantic search, multimodal AI, image recognition, trust and safety, and performance engineering, yet across all of these domains the through-line is consistent: building architectures that scale intelligence into practical, high-impact use.

For IICSPA Fellowship consideration, Hemang Manish Shah stands out as a technologist whose work combines original engineering design, advanced applied AI, enterprise-scale systems architecture, and measurable impact across large organizations. His record reflects the level of technical leadership, innovation, and field-shaping contribution that is well aligned with fellowship-level recognition.