Upload folder using huggingface_hub
README.md CHANGED
@@ -62,8 +62,6 @@ On the following parameters
 - **Gas Efficiency (%)** – Degree of gas optimization based on Slither’s suggestions.
 - **Security (%)** – Percentage of code free from common vulnerabilities detected by Slither.
 - **Average Lines of Code** – Average number of non-empty lines (comments included) in generated contracts, indicating verbosity or conciseness.
-- **Correctness (OpenAI Evaluation)** – GPT-4o Mini-assessed alignment of generated code with the prompt using a structured correctness rubric.
-- **Correctness (Human Evaluation)** – Expert-reviewed rating of how well the generated contract fulfills the original prompt and intent.

 ## Benchmark
 Below is a figure summarizing the performance of each model across the four evaluation metrics.
@@ -295,13 +293,6 @@ We analyzed each contract for known security vulnerabilities using Slither’s b
 - **Average Lines of Code (LOC)**
 Captures the average number of lines per generated contract, excluding blank lines but including comments. This metric reflects code verbosity or conciseness and helps gauge implementation completeness versus potential redundancy.
-
-- **Correctness (OpenAI Evaluation)**
-Evaluates how accurately the generated contract matches the intended prompt using GPT-4o Mini. Prompts and outputs are scored against a structured rubric, providing a scalable LLM-based perspective on prompt alignment.
-
-- **Correctness (Human Evaluation)**
-Involves manual review by a blockchain expert to assess how well the output satisfies the original prompt and category. This provides human-validated insight into the practical applicability and quality of the generated code.

 These metrics collectively provide a multi-dimensional view of the model’s effectiveness, spanning correctness, efficiency, security, and usability. They are designed to reflect both automated benchmarks and real-world developer expectations.
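The LOC metric described above (non-empty lines counted, comments included, averaged over generated contracts) can be sketched as follows. This is a minimal illustration of the counting rule as the README states it; the function names and the toy contract strings are assumptions, not part of the benchmark's actual tooling:

```python
def loc(source: str) -> int:
    """Count non-empty lines; comment lines count, blank lines do not."""
    return sum(1 for line in source.splitlines() if line.strip())


def average_loc(contracts: list[str]) -> float:
    """Average Lines of Code across a batch of generated contracts."""
    if not contracts:
        return 0.0
    return sum(loc(c) for c in contracts) / len(contracts)


# Two toy Solidity sources (hypothetical inputs):
a = "pragma solidity ^0.8.0;\n\n// token\ncontract A {}\n"
b = "contract B {\n    uint x;\n}\n"
print(average_loc([a, b]))  # (3 + 3) / 2 = 3.0
```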
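As a sketch of how a Security (%) figure like the one above could be aggregated: assuming one Slither report per contract and counting a contract as secure when Slither reports zero findings. The README does not specify the exact aggregation, so the finding counts and the per-contract rule here are illustrative assumptions:

```python
def security_percent(findings_per_contract: list[int]) -> float:
    """Percentage of contracts with zero Slither-detected vulnerabilities.

    findings_per_contract holds one finding count per generated contract
    (hypothetical data, e.g. taken from Slither's per-contract reports).
    """
    if not findings_per_contract:
        return 0.0
    clean = sum(1 for n in findings_per_contract if n == 0)
    return 100.0 * clean / len(findings_per_contract)


# Hypothetical Slither finding counts for four generated contracts:
print(security_percent([0, 2, 0, 1]))  # 50.0
```

The Gas Efficiency (%) column could be aggregated the same way over Slither's optimization suggestions rather than its vulnerability detectors.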