@bhabej
This developer is so mysterious, even their code has commitment issues.
Team members: Bhabishya Gurung (me), Sakhi Hashmat Khalil, Kiran Thapalia

Overview
This project addresses the challenge of identifying and aggregating duplicate innovations described by different organizations. It was developed as a hackathon submission for VTT, focusing on semantic AI and large language models (LLMs) to resolve ambiguity and unify innovation records.

Approach
Data Integration: Merged structured innovation relationship data from company websites and VTT domain pages.
Feature Extraction: For each innovation, extracted textual descriptions, full source documents, organization names, and source URLs.
Semantic Similarity: Used AI-based semantic similarity (likely leveraging embeddings and LLMs) to detect potential duplicates by comparing innovation descriptions.
Clustering & Aggregation: Grouped similar innovations into clusters and generated a unified summary for each cluster, preserving source and contributor information[1].

Technologies Used
Python (Jupyter Notebook)
Semantic AI (embeddings, LLMs)
Data processing with pandas
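The similarity and clustering steps described in the Approach can be illustrated with a small sketch. This is not the project's actual code: it swaps the embedding model for a plain word-count vector so that it runs with the standard library alone, and the record fields (`description`, `organization`, `source_url`), the function names, the `0.6` threshold, and the sample records are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Toy stand-in for an embedding: lowercase word counts (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_duplicates(records, threshold=0.6):
    """Group records whose descriptions are similar, using union-find."""
    vectors = [vectorize(r["description"]) for r in records]
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i, record in enumerate(records):
        clusters.setdefault(find(i), []).append(record)
    return list(clusters.values())


def aggregate(cluster):
    """Merge one cluster into a single record, keeping every source."""
    return {
        # Hypothetical rule: keep the longest description as the summary.
        "description": max((r["description"] for r in cluster), key=len),
        "organizations": sorted({r["organization"] for r in cluster}),
        "source_urls": sorted({r["source_url"] for r in cluster}),
    }


# Made-up sample records, not taken from the hackathon data.
sample = [
    {"description": "Biodegradable packaging film made from wood cellulose",
     "organization": "Org A", "source_url": "https://example.org/a"},
    {"description": "A biodegradable packaging film made from cellulose",
     "organization": "Org B", "source_url": "https://example.org/b"},
    {"description": "Quantum computer cooling system",
     "organization": "Org C", "source_url": "https://example.org/c"},
]

for cluster in cluster_duplicates(sample):
    print(aggregate(cluster))
```

In a real pipeline the `vectorize` step would presumably be replaced by embeddings from a language model, and the unified summary by an LLM-generated one; the union-find grouping and the source-preserving merge would stay roughly the same shape.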
Skills include: Turning coffee into code, debugging by staring intensely at the screen, and mastering the art of Stack Overflow copy-paste.
Social links? Pfft. I communicate exclusively via binary smoke signals.