Nuo Chen committed on
Commit 4d3b355 · verified · 1 Parent(s): 5d037a7

Update README.md

Files changed (1)
  1. README.md +159 -3
README.md CHANGED
@@ -1,3 +1,159 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ license_link: https://huggingface.co/Xtra-Computing/XtraGPT-7B/blob/main/LICENSE
4
+ language:
5
+ - en
6
+ pipeline_tag: text-generation
7
+ base_model: Qwen/Qwen2.5-7B-Instruct
8
+ tags:
9
+ - chat
10
+ library_name: transformers
11
+ ---
12
+ # XtraGPT-7B
13
+
14
+ ## Introduction
15
+
16
+ XtraGPT is a series of LLMs for Human-AI Collaboration on Controllable Scientific Paper Refinement, developed by the NUS [Xtra Computing Group](https://github.com/Xtra-Computing).
17
+
18
+ ## Requirements
19
+
20
+ Support for Qwen2.5 is included in recent releases of Hugging Face `transformers`; we advise you to use the latest version of `transformers`.
21
+
22
+ ## Quickstart
23
+
24
+ The following code snippet uses `apply_chat_template` to show how to load the tokenizer and model and how to generate content.
25
+
26
+ ```python
27
+ from transformers import AutoModelForCausalLM, AutoTokenizer
28
+ model_name = "Xtra-Computing/XtraGPT-7B"
29
+ model = AutoModelForCausalLM.from_pretrained(
30
+     model_name,
31
+     torch_dtype="auto",
32
+     device_map="auto"
33
+ )
34
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
35
+
36
+ paper_content="""
37
+ In this paper, we propose a metric-based probing method, namely, CAT-probing, to quantitatively evaluate how CodePTMs Attention scores relate to distances between AST nodes.
38
+ First, to denoise the input code sequence in the original attention scores matrix, we classify the rows/cols by token types that are pre-defined by compilers,
39
+ and then retain tokens whose types have the highest proportion scores to derive a filtered attention matrix (see Figure 1(b)).
40
+ Meanwhile, inspired by the works (Wang et al., 2020; Zhu et al., 2022), we add edges to improve the connectivity of AST and calculate the distances between nodes corresponding to the selected tokens,
41
+ which generates a distance matrix as shown in Figure 1(c). After that, we define CAT-score to measure the matching degree between the filtered attention matrix and the distance matrix.
42
+ Specifically, the point-wise elements of the two matrices are matched if both the two conditions are satisfied:
43
+ 1) the attention score is larger than a threshold; 2) the distance value is smaller than a threshold. If only one condition is reached, the elements are unmatched.
44
+ We calculate the CAT-score by the ratio of the number of matched elements to the summation of matched and unmatched elements.
45
+ Finally, the CAT-score is used to interpret how CodePTMs attend code structure, where a higher score indicates that the model has learned more structural information.
46
+ """
47
+
48
+ selected_content="""
49
+ After that, we define CAT-score to measure the matching degree between the filtered attention matrix and the distance matrix.
50
+ """
51
+
52
+ prompt ="""
53
+ help me redefine cat-score based on the context.
54
+ """
55
+
56
+ content = f"""
57
+ Please improve the selected content based on the following. Act as an expert model for improving articles **PAPER_CONTENT**.\n
58
+ The output needs to answer the **QUESTION** on **SELECTED_CONTENT** in the input. Avoid adding unnecessary length, unrelated details, overclaims, or vague statements.
59
+ Focus on clear, concise, and evidence-based improvements that align with the overall context of the paper.\n
60
+
61
+ <PAPER_CONTENT>
62
+ {paper_content}
63
+ </PAPER_CONTENT>\n
64
+
65
+ <SELECTED_CONTENT>
66
+ {selected_content}
67
+ </SELECTED_CONTENT>\n
68
+
69
+ <QUESTION>
70
+ {prompt}
71
+ </QUESTION>\n
72
+ """
73
+
74
+ messages = [
75
+     {"role": "user", "content": content}
76
+ ]
77
+
78
+ text = tokenizer.apply_chat_template(
79
+     messages,
80
+     tokenize=False,
81
+     add_generation_prompt=True
82
+ )
83
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
84
+ generated_ids = model.generate(
85
+     **model_inputs,
86
+     max_new_tokens=512
87
+ )
88
+ generated_ids = [
89
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
90
+ ]
91
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
92
+ ```
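The list comprehension near the end of the snippet above slices each output sequence because, for a causal language model, `generate` returns the prompt tokens followed by the newly generated ones. A minimal sketch of that trimming step with plain Python lists (the helper name `strip_prompt_tokens` and the token ids are illustrative, not part of the README):

```python
def strip_prompt_tokens(input_ids, output_ids):
    """Return only the newly generated tokens for each sequence in the batch."""
    # Each output sequence starts with its prompt, so drop len(prompt) leading tokens.
    return [out[len(inp):] for inp, out in zip(input_ids, output_ids)]


# Toy batch with one sequence: a 3-token prompt followed by 2 generated tokens.
prompt_batch = [[101, 7, 8]]
generated_batch = [[101, 7, 8, 42, 43]]

new_tokens = strip_prompt_tokens(prompt_batch, generated_batch)
print(new_tokens)  # [[42, 43]]
```

Decoding only the trimmed ids keeps the prompt text out of `response`.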
93
+
94
+
95
+ ```python
96
+ from openai import OpenAI
97
+ model_name = "Xtra-Computing/XtraGPT-7B"
98
+ client = OpenAI(
99
+     base_url="http://localhost:8088/v1",  # assumes an OpenAI-compatible server is already running at this address
100
+     api_key="sk-1234567890"  # placeholder key
101
+ )
102
+
103
+ paper_content="""
104
+ markdown
105
+ """
106
+ selected_content="""
107
+ After that, we define CAT-score to measure the matching degree between the filtered attention matrix and the distance matrix.
108
+ """
109
+
110
+ prompt ="""
111
+ help me redefine cat-score based on the context.
112
+ """
113
+
114
+ content = f"""
115
+ Please improve the selected content based on the following. Act as an expert model for improving articles **PAPER_CONTENT**.\n
116
+ The output needs to answer the **QUESTION** on **SELECTED_CONTENT** in the input. Avoid adding unnecessary length, unrelated details, overclaims, or vague statements.
117
+ Focus on clear, concise, and evidence-based improvements that align with the overall context of the paper.\n
118
+
119
+ <PAPER_CONTENT>
120
+ {paper_content}
121
+ </PAPER_CONTENT>\n
122
+
123
+ <SELECTED_CONTENT>
124
+ {selected_content}
125
+ </SELECTED_CONTENT>\n
126
+
127
+ <QUESTION>
128
+ {prompt}
129
+ </QUESTION>\n
130
+ """
131
+
132
+ response = client.chat.completions.create(
133
+     model="xtragpt",  # must match the model name your server exposes; adjust for your deployment
134
+     messages=[{"role": "user", "content": content}],
135
+     temperature=0.7,
136
+     max_tokens=16384
137
+ )
138
+ print(response.choices[0].message.content)
139
+
140
+ ```
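Both snippets above assemble the same user message from three parts: the paper, the selected passage, and the question. As a minimal sketch, that assembly could be factored into one helper. The function name `build_refinement_prompt` is hypothetical, and the exact whitespace the model expects is an assumption; the instruction wording and tag names are copied from the snippets.

```python
def build_refinement_prompt(paper_content: str, selected_content: str, question: str) -> str:
    """Assemble the user message in the tagged layout used by the snippets above."""
    # Instruction text copied from the README snippets.
    instructions = (
        "Please improve the selected content based on the following. "
        "Act as an expert model for improving articles **PAPER_CONTENT**.\n"
        "The output needs to answer the **QUESTION** on **SELECTED_CONTENT** in the input. "
        "Avoid adding unnecessary length, unrelated details, overclaims, or vague statements.\n"
        "Focus on clear, concise, and evidence-based improvements that align with the "
        "overall context of the paper.\n"
    )
    sections = [
        ("PAPER_CONTENT", paper_content),
        ("SELECTED_CONTENT", selected_content),
        ("QUESTION", question),
    ]
    # Wrap each part in its tag pair; surrounding whitespace is stripped (an assumption).
    body = "\n".join(f"<{tag}>\n{text.strip()}\n</{tag}>\n" for tag, text in sections)
    return f"{instructions}\n{body}"


content = build_refinement_prompt(
    paper_content="(full paper text)",
    selected_content="(passage to refine)",
    question="help me redefine cat-score based on the context.",
)
messages = [{"role": "user", "content": content}]
```

The resulting `messages` list can be passed either to `tokenizer.apply_chat_template` or to `client.chat.completions.create`, as in the snippets above.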
141
+
142
+ ## Citation
143
+
144
+ If you find our work helpful, please consider citing it.
145
+
146
+ ```
147
+ @misc{xtracomputing2025xtraqa,
148
+ title = {XtraQA},
149
+ url = {https://huggingface.co/Xtra-Computing/XtraGPT-7B},
150
+ author = {Xtra Computing Group},
151
+ year = {2025}
152
+ }
153
+ @article{xtracomputing2025xtragpt,
154
+ title={XtraGPT: LLMs for Human-AI Collaboration on Controllable Scientific Paper Refinement},
155
+ author={Xtra Computing Group},
156
+ journal={arXiv preprint arXiv:abcdefg},
157
+ year={2025}
158
+ }
159
+ ```