muhammad-mujtaba-ai committed on
Commit
c407764
·
verified ·
1 Parent(s): 4d28c2d

Upload folder using huggingface_hub

Files changed (2)
  1. Benchmark.png +0 -0
  2. README.md +32 -17
Benchmark.png CHANGED
README.md CHANGED
@@ -27,7 +27,7 @@ Solidity-Code-LLM is a fine tuned large language model designed to understand, g
27
  - **License:** MIT License
28
  - **Finetuned from model:** Salesforce/codegen-2B-multi
29
 
30
- ![Training Pipeline](logo.png)
31
 
32
 
33
  # Model Details
@@ -45,28 +45,34 @@ Solidity-Code-LLM is a specialized language model trained in two stages: pre-tra
45
  - **Dtype**: bfloat16
46
 
47
  ### Model Sources
48
- For more details, please refer to,
49
- - **Paper [optional]:** {{ paper | default("[More Information Needed]", true)}}
50
  - **Demo:** [Demo On Hugging Face Space](https://huggingface.co/spaces/Chain-GPT/SolidityLLMDemo)
51
 
52
-
53
  # Model Comparison
54
  We have compared our model with the following models:
55
- - Qwen/CodeQwen1.5-7B
56
- - deepseek-ai/deepseek-coder-1.3b-base
57
- - codellama/CodeLlama-7b-hf
58
  - GPT 4o mini
59
 
60
  On the following parameters:
61
- - Compilation(%)--Percentage of generated contracts that compile successfully without modification.
62
- - OpenZeppelin Compliance(%)--Adherence to OpenZeppelin library usage and standards.
63
- - Gas Efficiency(%)--Degree of gas optimization based on Slither’s suggestions.
64
- - Security(%)--Percentage of code free from common vulnerabilities detected by Slither.
65
 
66
  ## Benchmark
67
- The figure below presents a detailed comparison of the models across all evaluation criteria
68
  ![Benchmark](Benchmark.png)
69
 
70
 
71
  # Uses
72
  ### Direct Use
@@ -104,7 +110,7 @@ Use the code below to get started with the model.
104
  ```python
105
  from transformers import AutoModelForCausalLM, AutoTokenizer
106
 
107
- modelpath = "ChainGPT/SolidityLLM"
108
 
109
  tokenizer = AutoTokenizer.from_pretrained(modelpath)
110
  model = AutoModelForCausalLM.from_pretrained(modelpath)
@@ -129,7 +135,7 @@ from threading import Thread
129
  from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer
130
 
131
  model = AutoModelForCausalLM.from_pretrained(
132
- "Chain-GPT/Solidity-LLM",
133
  torch_dtype=torch.bfloat16,
134
  device_map="cuda"
135
  )
@@ -272,7 +278,7 @@ contract DecentralizedLibrary is Ownable(msg.sender) {
272
  # Evaluation Metrics
273
  To evaluate the performance of our fine-tuned LLM specialized in Solidity smart contract generation, we used **[Slither](https://github.com/crytic/slither)**, a static analysis framework widely used for analyzing Solidity code.
274
 
275
- We focused on four key evaluation criteria:
276
 
277
  - **Compilation Success Rate**
278
  We measured the percentage of generated smart contracts that compile successfully without modification. This helps assess the syntactic and structural correctness of the model outputs.
@@ -286,8 +292,17 @@ Using Slither’s gas optimization analysis, we identified areas in the generate
286
  - **Security Vulnerabilities**
287
  We analyzed each contract for known security vulnerabilities using Slither’s built-in detectors. We recorded the number and severity of the vulnerabilities detected, providing a measure of the security quality of the model’s outputs.
288
 
289
- These evaluation metrics help quantify the practical usability and reliability of the generated smart contracts in real-world scenarios.
290
 
291
 
292
  # Summary
293
- Model shows improved understanding and generation capabilities in Solidity when compared to baseline LLMs not trained on Solidity data.
 
27
  - **License:** MIT License
28
  - **Finetuned from model:** Salesforce/codegen-2B-multi
29
 
30
+ ![ChainGPT Logo](logo.png)
31
 
32
 
33
  # Model Details
 
45
  - **Dtype**: bfloat16
46
 
47
  ### Model Sources
48
+ The model is deployed on a Hugging Face Space for inference.
 
49
  - **Demo:** [Demo On Hugging Face Space](https://huggingface.co/spaces/Chain-GPT/SolidityLLMDemo)
50
 
 
51
  # Model Comparison
52
  We have compared our model with the following models:
53
+ - GPT 4.5 Preview
54
  - GPT 4o mini
55
+ - [Qwen 2.5-Coder-7B](https://huggingface.co/Qwen/Qwen2.5-Coder-7B)
56
+ - [DeepSeek-Coder-7B-Instruct-v1.5](https://huggingface.co/deepseek-ai/deepseek-coder-7b-instruct-v1.5)
57
 
58
  On the following parameters:
59
+ - **Compilation(%)** - Percentage of generated contracts that compile successfully without modification.
60
+ - **OpenZeppelin Compliance(%)** - Adherence to OpenZeppelin library usage and standards.
61
+ - **Gas Efficiency(%)** - Degree of gas optimization based on Slither’s suggestions.
62
+ - **Security(%)** - Percentage of code free from common vulnerabilities detected by Slither.
63
+ - **Average Lines of Code** - Average number of non-empty lines (comments included) in generated contracts, indicating verbosity or conciseness.
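The percentage parameters above can be sketched as a simple aggregation over per-contract results. This is a hypothetical illustration: the field names (`compiles`, `secure`, `loc`, etc.) are assumptions, not the actual evaluation schema used for the benchmark.

```python
# Hypothetical per-contract evaluation records; the fields and values below
# are illustrative only, not the real benchmark data.
results = [
    {"compiles": True,  "oz_compliant": True,  "gas_ok": True,  "secure": True,  "loc": 41},
    {"compiles": True,  "oz_compliant": False, "gas_ok": True,  "secure": False, "loc": 58},
    {"compiles": False, "oz_compliant": False, "gas_ok": False, "secure": False, "loc": 23},
]

def pct(field: str) -> float:
    """Percentage of generated contracts for which the boolean `field` holds."""
    return 100 * sum(r[field] for r in results) / len(results)

compilation_rate = pct("compiles")                       # Compilation(%)
security_rate = pct("secure")                            # Security(%)
avg_loc = sum(r["loc"] for r in results) / len(results)  # Average Lines of Code
```

Each percentage is simply the share of generated contracts that pass the corresponding check; Average Lines of Code is a plain mean rather than a percentage.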
64
 
65
  ## Benchmark
66
+ Below is a figure summarizing the performance of each model across all evaluation metrics.
67
  ![Benchmark](Benchmark.png)
68
 
69
+ The following observations were made regarding Solidity LLM:
70
+ - Highest Compilation Success Rate (~83%), demonstrating strong Solidity syntax and structure generation.
71
+ - Good OpenZeppelin Compliance (~65%), indicating frequent use of standard libraries and contract patterns. While GPT-4.5, being a much larger model, naturally exhibits stronger adherence to OpenZeppelin standards due to its broader training data, Solidity LLM achieves commendable compliance given its smaller size.
72
+ - Top Gas Efficiency (~72%), producing optimized code as evaluated by tools like Slither.
73
+ - Moderate Security Score (~58%), showing acceptable security posture but room for improvement. GPT-4.5 benefits from its scale in handling more security cases.
74
+ - Concise Code (~70% LOC score), generating relatively compact and efficient smart contracts.
75
+
76
 
77
  # Uses
78
  ### Direct Use
 
110
  ```python
111
  from transformers import AutoModelForCausalLM, AutoTokenizer
112
 
113
+ modelpath = "Chain-GPT/Solidity-LLM"
114
 
115
  tokenizer = AutoTokenizer.from_pretrained(modelpath)
116
  model = AutoModelForCausalLM.from_pretrained(modelpath)
 
135
  from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer
136
 
137
  model = AutoModelForCausalLM.from_pretrained(
138
+ "Chain-GPT/Solidity-LLM",
139
  torch_dtype=torch.bfloat16,
140
  device_map="cuda"
141
  )
 
278
  # Evaluation Metrics
279
  To evaluate the performance of our fine-tuned LLM specialized in Solidity smart contract generation, we used **[Slither](https://github.com/crytic/slither)**, a static analysis framework widely used for analyzing Solidity code.
280
 
281
+ We focused on the following key evaluation criteria:
282
 
283
  - **Compilation Success Rate**
284
  We measured the percentage of generated smart contracts that compile successfully without modification. This helps assess the syntactic and structural correctness of the model outputs.
 
292
  - **Security Vulnerabilities**
293
  We analyzed each contract for known security vulnerabilities using Slither’s built-in detectors. We recorded the number and severity of the vulnerabilities detected, providing a measure of the security quality of the model’s outputs.
294
 
295
+ - **Average Lines of Code (LOC)**
296
+ Captures the average number of lines per generated contract, excluding blank lines but including comments. This metric reflects code verbosity or conciseness, and helps gauge implementation completeness versus potential redundancy.
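The LOC rule described above (blank lines excluded, comment lines counted) can be sketched in a few lines. This is an illustrative counter, not the exact script used in the evaluation.

```python
# Illustrative LOC counter for the metric above: skip blank lines, count
# everything else, including comment lines.
def contract_loc(source: str) -> int:
    """Count non-empty lines (comments included) in a Solidity source string."""
    return sum(1 for line in source.splitlines() if line.strip())

sample = """// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract Counter {
    uint256 public count;
}
"""
print(contract_loc(sample))  # 5 (the blank line is excluded, the comment counts)
```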
297
+
298
+ These metrics collectively provide a multi-dimensional view of the model’s effectiveness, spanning correctness, efficiency, security, and usability. They are designed to reflect both automated benchmarks and real-world developer expectations.
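As a rough illustration of the Slither-based checks, a report produced with `slither contract.sol --json out.json` can be tallied by detector impact level. The report structure below reflects Slither's JSON output as we understand it; treat the exact field names as an assumption, and the example report is synthetic.

```python
from collections import Counter

def vulnerability_severity_counts(report: dict) -> Counter:
    """Count Slither detector findings by their impact level."""
    detectors = report.get("results", {}).get("detectors", [])
    return Counter(d["impact"] for d in detectors)

# Synthetic, truncated example of a Slither JSON report (not real tool output).
report = {
    "success": True,
    "results": {"detectors": [
        {"check": "reentrancy-eth", "impact": "High"},
        {"check": "timestamp", "impact": "Low"},
        {"check": "naming-convention", "impact": "Informational"},
    ]},
}
severities = vulnerability_severity_counts(report)  # counts findings per impact level
```

A contract could then be scored as "secure" for the Security(%) metric when, for example, it has no High- or Medium-impact findings; the exact threshold used in the evaluation is not specified here.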
299
 
300
 
301
  # Summary
302
+ Solidity LLM, despite its compact 2B parameter size, delivers standout performance in generating Solidity smart contracts. It achieved the highest compilation success rate (83%), showcasing robust syntactic and structural understanding. Its strong OpenZeppelin compliance (65%), though slightly behind very large models like GPT-4.5, is impressive given the scale difference, reflecting reliable use of industry-standard patterns and libraries.
303
+
304
+ Furthermore, Solidity LLM ranked highest in gas efficiency (72%), producing optimized code suitable for cost-sensitive deployments. While the security score (58%) indicates room for improvement, the model consistently generated contracts secure enough for practical use. Its concise output (70% LOC score) also suggests an efficient coding style, balancing brevity with completeness.
305
+
306
+ Overall, Solidity LLM proves to be a resource-efficient, reliable, and well-balanced model for Solidity code generation.
307
+
308
+ Looking ahead, future releases will focus on improving support for newer versions of the Solidity language and OpenZeppelin libraries, enhancing user interaction by enabling contract modifications, expanding compatibility to other languages like Rust, and developing larger models capable of handling longer context windows.