Spaces:
Running
Running
[email protected]
committed on
Commit
·
c40ac63
1
Parent(s):
d4d8b2d
update
Browse files- app.py +10 -0
- src/about.py +5 -1
app.py
CHANGED
@@ -99,6 +99,16 @@ with demo:
|
|
99 |
with gr.TabItem("π
LLM Benchmark", elem_id="llm-benchmark-tab-table", id=0):
|
100 |
leaderboard = init_leaderboard(LEADERBOARD_DF)
|
101 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
102 |
with gr.TabItem("π About", elem_id="llm-benchmark-tab-table", id=2):
|
103 |
gr.Markdown(LLM_BENCHMARKS_TEXT, elem_classes="markdown-text")
|
104 |
|
|
|
99 |
with gr.TabItem("π
LLM Benchmark", elem_id="llm-benchmark-tab-table", id=0):
|
100 |
leaderboard = init_leaderboard(LEADERBOARD_DF)
|
101 |
|
102 |
+
with gr.TabItem("π Performance Plot", elem_id="llm-benchmark-tab-table", id=1):
|
103 |
+
gr.Markdown(LLM_BENCHMARKS_TEXT, elem_classes="markdown-text")
|
104 |
+
print(LEADERBOARD_DF)
|
105 |
+
# with gr.Row():
|
106 |
+
# bs_1_plot = gr.components.Plot(
|
107 |
+
# value=plot_throughput(LEADERBOARD_DF, bs=1),
|
108 |
+
# elem_id="bs1-plot",
|
109 |
+
# show_label=False,
|
110 |
+
# )
|
111 |
+
|
112 |
with gr.TabItem("π About", elem_id="llm-benchmark-tab-table", id=2):
|
113 |
gr.Markdown(LLM_BENCHMARKS_TEXT, elem_classes="markdown-text")
|
114 |
|
src/about.py
CHANGED
@@ -33,7 +33,7 @@ Intro text
|
|
33 |
|
34 |
# Which evaluations are you running? how can people reproduce what you have?
|
35 |
LLM_BENCHMARKS_TEXT = '''
|
36 |
-
##
|
37 |
The prompt will follow the following style. Models' output are expected to follow this format.
|
38 |
```
|
39 |
Select the correct option(s) from the following options given the question. To solve the problem, follow the Let's think Step by Step reasoning strategy.
|
@@ -47,6 +47,10 @@ E voltage
|
|
47 |
{"step_1": "<Step 1 of your reasoning>", "step_2": "<Step 2 of your reasoning>", "step_n": "<Step n of your reasoning>", "answer": <the list of selected option, e.g., ["A", "B", "C", "D", "E"]>}
|
48 |
Your output in a single line:
|
49 |
```
|
|
|
|
|
|
|
|
|
50 |
## Reproducibility
|
51 |
To reproduce our results, here is the commands you can run:
|
52 |
|
|
|
33 |
|
34 |
# Which evaluations are you running? how can people reproduce what you have?
|
35 |
LLM_BENCHMARKS_TEXT = '''
|
36 |
+
## Prompt Format
|
37 |
The prompt will follow the following style. Models' output are expected to follow this format.
|
38 |
```
|
39 |
Select the correct option(s) from the following options given the question. To solve the problem, follow the Let's think Step by Step reasoning strategy.
|
|
|
47 |
{"step_1": "<Step 1 of your reasoning>", "step_2": "<Step 2 of your reasoning>", "step_n": "<Step n of your reasoning>", "answer": <the list of selected option, e.g., ["A", "B", "C", "D", "E"]>}
|
48 |
Your output in a single line:
|
49 |
```
|
50 |
+
## Expected Output Format
|
51 |
+
```
|
52 |
+
{"step_1": "<Step 1 of your reasoning>", "step_2": "<Step 2 of your reasoning>", "step_n": "<Step n of your reasoning>", "answer": <the list of selected option, e.g., ["A", "B", "C", "D", "E"]>}
|
53 |
+
```
|
54 |
## Reproducibility
|
55 |
To reproduce our results, here is the commands you can run:
|
56 |
|