Tests

tests.yaml

187 workflow run results

udpated piqa (#222) Tests #877: Commit 506abda pushed by clefourrier

July 25, 2024 09:43

39m 5s main

fix (#233) Tests #873: Commit db502dd pushed by NathanHB

July 24, 2024 18:23

37m 54s main

Removes default bert scorer init (#234) Tests #867: Commit 2b4b637 pushed by NathanHB

July 24, 2024 12:43

37m 46s main

remove latex writer since we don't use it (#231) Tests #859: Commit 86fbe64 pushed by clefourrier

July 23, 2024 12:33

36m 32s main

Update issue templates (#235) Tests #854: Commit 003f05e pushed by NathanHB

July 23, 2024 10:19

38m 36s main

Fix a tiny bug in DROP metric (#229) Tests #848: Commit 66ed7a2 pushed by clefourrier

July 18, 2024 10:36

39m 1s main

Quantization related issues (#224) Tests #842: Commit 44f9a46 pushed by clefourrier

July 17, 2024 14:36

37m 59s main

Make evaluator invariant of input request type order (#215) Tests #838: Commit 951cd5b pushed by clefourrier

July 17, 2024 13:48

37m 46s main

Fix _init_max_length in base_model.py (#185) Tests #832: Commit d43c9a3 pushed by clefourrier

July 17, 2024 12:34

37m 13s main

launch lighteval using lighteval --args (#152) Tests #826: Commit 4550cb7 pushed by clefourrier

July 17, 2024 09:16

41m 41s main

Add metrics as functions (#214) Tests #821: Commit aaf7e8a pushed by clefourrier

July 17, 2024 08:07

36m 58s main

should fix most inference endpoints issues of version config (#226) Tests #816: Commit 733257f pushed by NathanHB

July 16, 2024 13:46

39m 4s main

Transformers model as Judge Tests #803: Pull request #174 synchronize by NathanHB

July 11, 2024 11:45

2m 36s anilaltuner:main

Data split depending on eval params (#169) Tests #802: Commit 66e6aae pushed by NathanHB

July 11, 2024 11:16

37m 18s main

Now only uses functions for prompt definition (#213) Tests #792: Commit 4651531 pushed by clefourrier

July 9, 2024 13:29

38m 12s main

Use only dataclasses for task init (#212) Tests #789: Commit 3aaec22 pushed by clefourrier

July 9, 2024 12:42

36m 57s main

Fix a few typos in metrics.py (#218) Tests #787: Commit 3f90950 pushed by clefourrier

July 9, 2024 11:42

37m 26s main

Homogeneize logging system (#150) Tests #784: Commit ac57b78 pushed by clefourrier

July 9, 2024 10:13

39m 47s main

Adds a dummy/random model for baseline init (#220) Tests #778: Commit 70f7fc6 pushed by clefourrier

July 9, 2024 07:41

38m 58s main

Fix the bug (#216) Tests #763: Commit 0528f29 pushed by clefourrier

July 8, 2024 09:04

36m 36s main

[Bugfix] Avoid truncating the outputs based on string lengths (#201) Tests #761: Commit 6064695 pushed by clefourrier

July 8, 2024 06:38

38m 25s main

Fix a few typos and do a tiny refactor (#187) Tests #746: Commit 843a0f8 pushed by clefourrier

July 5, 2024 06:57

37m 39s main

Fix a few typos and do a tiny refactor Tests #745: Pull request #187 synchronize by sadra-barikbin

July 4, 2024 19:42

36m 45s sadra-barikbin:main

ADD GPT-4 as Judge (#206) Tests #743: Commit 0bceaee pushed by clefourrier

July 4, 2024 14:38

36m 38s main

fix llm as judge warnings (#173) Tests #737: Commit 3a80833 pushed by clefourrier

July 4, 2024 10:48

36m 45s main

Actions: huggingface/lighteval