- University of Cambridge
- Cambridge, UK
- lorenzopacchiardi.me/
- https://orcid.org/0000-0003-4760-7638
Stars
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Acceptance rates for the major AI conferences
Website
SmartPlay is a benchmark for Large Language Models (LLMs). Uses a variety of games to test various important LLM capabilities as agents. SmartPlay is designed to be easy to use, and to support futu…
Jekyll version of the newest Agency Bootstrap theme, plus new features: Google Analytics, Markdown support, custom pages, and more!
The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.
🚵 Landing Pages of Ant Design System
A free React / Next.js landing page template designed to showcase open source projects, SaaS products, online services, and more. Made by
An extensible benchmark for evaluating large language models on planning
Synthetic question-answering dataset to formally analyze the chain-of-thought output of large language models on a reasoning task.
Forecasting Future World Events with Neural Networks (NeurIPS 2022)
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image …
A Python library for variational inference with normalizing flow and annealing
Code and data for the paper Revealing the structure of language model capabilities
Scientific Inkscape: Inkscape extensions for figure resizing and editing
Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions"
🐢 Open-Source Evaluation & Testing for LLMs and ML models
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Specify what you want it to build, the AI asks for clarification, and then builds it.
Likelihood-free AMortized Posterior Estimation with PyTorch
A domain-specific probabilistic programming language for modeling and inference with language models
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
📚 A curated list of papers & technical articles on AI Quality & Safety