muhammad-mujtaba-ai committed on
Commit b371fab (verified)
1 Parent(s): 32882f1

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +11 -1
README.md CHANGED
@@ -61,6 +61,8 @@ On the following parameters
61
  - OpenZeppelin Compliance(%)--Adherence to OpenZeppelin library usage and standards.
62
  - Gas Efficiency(%)--Degree of gas optimization based on Slither’s suggestions.
63
  - Security(%)--Percentage of code free from common vulnerabilities detected by Slither.
 
 
64
 
65
  ## Benchmark
66
  Below is a figure summarizing the performance of each model across the four evaluation metrics.
@@ -275,7 +277,7 @@ contract DecentralizedLibrary is Ownable(msg.sender) {
275
  # Evaluation Metrics
276
  To evaluate the performance of our fine-tuned LLM specialized in Solidity smart contract generation, we used **[Slither](https://github.com/crytic/slither)**, a static analysis framework widely used for analyzing Solidity code.
277
 
278
- We focused on four key evaluation criteria:
279
 
280
  - **Compilation Success Rate**
281
  We measured the percentage of generated smart contracts that compile successfully without modification. This helps assess the syntactic and structural correctness of the model outputs.
@@ -289,8 +291,16 @@ Using Slither’s gas optimization analysis, we identified areas in the generate
289
  - **Security Vulnerabilities**
290
  We analyzed each contract for known security vulnerabilities using Slither’s built-in detectors. We recorded the number and severity of the vulnerabilities detected, providing a measure of the security quality of the model’s outputs.
291
 
 
 
 
 
 
 
292
  These evaluation metrics help quantify the practical usability and reliability of the generated smart contracts in real-world scenarios.
293
 
294
 
 
 
295
  # Summary
296
  The model shows improved understanding and generation capabilities in Solidity when compared to baseline LLMs not trained on Solidity data.
 
61
  - OpenZeppelin Compliance(%)--Adherence to OpenZeppelin library usage and standards.
62
  - Gas Efficiency(%)--Degree of gas optimization based on Slither’s suggestions.
63
  - Security(%)--Percentage of code free from common vulnerabilities detected by Slither.
64
+ - Average Lines of Code--Length of the generated code, an indicator of how verbose or complete the output is.
65
+ - Correctness of Code--How well the generated code aligns with the given prompt and category.
66
 
67
  ## Benchmark
68
  Below is a figure summarizing the performance of each model across the four evaluation metrics.
 
277
  # Evaluation Metrics
278
  To evaluate the performance of our fine-tuned LLM specialized in Solidity smart contract generation, we used **[Slither](https://github.com/crytic/slither)**, a static analysis framework widely used for analyzing Solidity code.
279
 
280
+ We focused on six key evaluation criteria:
281
 
282
  - **Compilation Success Rate**
283
  We measured the percentage of generated smart contracts that compile successfully without modification. This helps assess the syntactic and structural correctness of the model outputs.
 
291
  - **Security Vulnerabilities**
292
  We analyzed each contract for known security vulnerabilities using Slither’s built-in detectors. We recorded the number and severity of the vulnerabilities detected, providing a measure of the security quality of the model’s outputs.
293
 
294
+ - **Average Lines of Code**
295
+ This metric provides insight into the verbosity or conciseness of the model’s output. A higher LOC may indicate either redundancy or a more complete implementation, while a lower LOC may indicate either efficiency or missing implementation details, depending on context.
296
+
297
+ - **Correctness of Code**
298
+ To assess how well the generated code aligns with the given prompt and category, we conducted both a manual review and an OpenAI LLM evaluation of each generated contract. In both cases, the prompt and the generated code were compared closely to check their alignment.
299
+
300
  These evaluation metrics help quantify the practical usability and reliability of the generated smart contracts in real-world scenarios.
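
As an illustration only, the sketch below shows one way the aggregate numbers for three of the criteria above (compilation success rate, average lines of code, and a count of Slither findings by severity) could be computed once each contract has been compiled and analyzed. The `ContractResult` record and the function names are hypothetical and are not taken from the project's evaluation code. The report layout (`results` -> `detectors` -> `impact`) is assumed to match the JSON that Slither writes with its `--json` option; the README does not state how the reports were stored or how findings were converted into percentages.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ContractResult:
    """Hypothetical record for one generated contract (not from the README)."""
    source: str                 # generated Solidity source text
    compiled: bool              # True if it compiled without modification
    slither_report: dict = field(default_factory=dict)  # parsed Slither JSON, {} if none


def compilation_success_rate(results):
    """Percentage of contracts that compiled without modification."""
    if not results:
        return 0.0
    return 100.0 * sum(r.compiled for r in results) / len(results)


def average_loc(results):
    """Mean number of non-blank source lines per contract."""
    if not results:
        return 0.0
    counts = [
        sum(1 for line in r.source.splitlines() if line.strip())
        for r in results
    ]
    return sum(counts) / len(counts)


def findings_by_impact(results):
    """Count detector findings per severity across all reports.

    Assumes each report looks like {"results": {"detectors": [{"impact": ...}]}}.
    """
    counter = Counter()
    for r in results:
        detectors = r.slither_report.get("results", {}).get("detectors", [])
        for finding in detectors:
            counter[finding.get("impact", "Unknown")] += 1
    return counter
```

Counting only non-blank lines is one possible definition of LOC; counting all lines, or excluding comments, would give different averages, so the definition should be stated alongside any reported figure.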
301
 
302
 
303
+
304
+
305
  # Summary
306
  Model shows improved understanding and generation capabilities in Solidity when compared to baseline LLMs not trained on Solidity data.