Hacker News | shahules's submissions
1. Cloning Bench: Evaluating AI Agents on Visual Website Cloning (github.com/vibrantlabsai)
2 points by shahules 22 days ago | 1 comment
2. PA bench: Evaluating web agents on real world personal assistant workflows (vibrantlabs.com)
38 points by shahules 58 days ago | 9 comments
3. PA Bench: Evaluating Frontier Models on Multi-Tab PA Tasks (vibrantlabs.com)
7 points by shahules 64 days ago | 1 comment
4. Show HN: Ragas – Open-source library for evaluating RAG pipelines (github.com/explodinggradients)
121 points by shahules on March 21, 2024 | 26 comments
5. Show HN: Ragas – Open-source library for evals and testing RAG systems (github.com/explodinggradients)
15 points by shahules on March 20, 2024 | 9 comments
6. Show HN: The rise of open source large language models (explodinggradients.com)
5 points by shahules on April 13, 2023
7. Show HN: GPT4 vs. GPT3: What you should know (explodinggradients.com)
2 points by shahules on March 28, 2023
8. Show HN: Open-source alternative to Adobe speech enhancer (github.com/shahules786)
3 points by shahules on Dec 20, 2022
