fm-universe/deepseek-coder-7b-instruct-v1.5-fma

@@ -1,9 +1,11 @@
 ---
-license: mit
-language:
-- en
 base_model:
 - deepseek-ai/deepseek-coder-7b-instruct-v1.5
 ---
 <p align="center">
@@ -12,19 +14,8 @@ base_model:
 ## Introduction
-We present a fine-tuned model for formal verification tasks. It is fine-tuned in five formal specification languages (Cog, Dafny, Lean4, ACSL, and TLA) on six formal-verification-related tasks:
-- **Requirement Analysis**: given requirements and description of the verification or modeling goals, decomposing the goal into detailed verification steps
-- **Proof/Model Generation**: given requirements and description of the verification or modeling goals, writing formal proofs or models that can be verified by verifier/model checker.
-- **Proof segment generation**
-- **Proof Completion**: complete the given incomplete proofs or models
-- **Proof Infilling**: filling in the middle of the given incomplete proofs or models
-- **Code 2 Proof**: (Currently only support for ACSL whose specification is in form of code annotations) given the code under verification, generate the proof with the specifications
 ## Application Scenario
@@ -57,9 +48,9 @@ You only need to return the TLA formal specification without explanation.
 input_text = """
 An operation `LM_Inner_Rsp(p)` that represents a response process for a given parameter `p`. It satisfies the following conditions:
-  - The control state `octl[p]` is equal to `\"done\"`.
   - The `Reply(p, obuf[p], memInt, memInt')` operation is executed.
-  - The control state `octl` is updated by setting the `p` index of `octl` to `\"rdy\"`.
   - The variables `omem` and `obuf` remain unchanged.
 """
@@ -70,7 +61,8 @@ model = AutoModelForCausalLM.from_pretrained(
 )
 tokenizer = AutoTokenizer.from_pretrained(model_name)
-messages = [{"role": "user", "content": f"{instruct}\n{input_text}"}]
 text = tokenizer.apply_chat_template(
     messages, tokenize=False, add_generation_prompt=True
@@ -101,9 +93,9 @@ You only need to return the TLA formal specification without explanation.
 input_text = """
 An operation `LM_Inner_Rsp(p)` that represents a response process for a given parameter `p`. It satisfies the following conditions:
-  - The control state `octl[p]` is equal to `\"done\"`.
   - The `Reply(p, obuf[p], memInt, memInt')` operation is executed.
-  - The control state `octl` is updated by setting the `p` index of `octl` to `\"rdy\"`.
   - The variables `omem` and `obuf` remain unchanged.
 """
@@ -123,7 +115,8 @@ llm = LLM(
 )
 # Prepare chat messages
-chat_message = [{"role": "user", "content": f"{instruct}\n{input_text}"}]
 # Inference
 responses = llm.chat(chat_message, greed_sampling, use_tqdm=True)
@@ -142,5 +135,4 @@ print(responses[0].outputs[0].text)
       primaryClass={cs.AI},
       url={https://arxiv.org/abs/2501.16207},
 }
-```

 ---
 base_model:
 - deepseek-ai/deepseek-coder-7b-instruct-v1.5
+language:
+- en
+license: mit
+pipeline_tag: text-generation
+library_name: transformers
 ---
 <p align="center">
 ## Introduction
+This model, presented in the paper [From Informal to Formal -- Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs](https://hf.co/papers/2501.16207), is a fine-tuned LLM for formal verification tasks. Trained on 18k high-quality instruction-response pairs across five formal specification languages (Coq, Dafny, Lean4, ACSL, and TLA+), it excels at various sub-tasks including requirement analysis, proof/model generation, and code-to-proof translation (for ACSL). Interestingly, fine-tuning on this formal data also enhances the model's mathematics, reasoning, and coding capabilities.
 ## Application Scenario
 input_text = """
 An operation `LM_Inner_Rsp(p)` that represents a response process for a given parameter `p`. It satisfies the following conditions:
+  - The control state `octl[p]` is equal to `"done"`.
   - The `Reply(p, obuf[p], memInt, memInt')` operation is executed.
+  - The control state `octl` is updated by setting the `p` index of `octl` to `"rdy"`.
   - The variables `omem` and `obuf` remain unchanged.
 """
 )
 tokenizer = AutoTokenizer.from_pretrained(model_name)
+messages = [{"role": "user", "content": f"{instruct}\n{input_text}"}]
 text = tokenizer.apply_chat_template(
     messages, tokenize=False, add_generation_prompt=True
 input_text = """
 An operation `LM_Inner_Rsp(p)` that represents a response process for a given parameter `p`. It satisfies the following conditions:
+  - The control state `octl[p]` is equal to `"done"`.
   - The `Reply(p, obuf[p], memInt, memInt')` operation is executed.
+  - The control state `octl` is updated by setting the `p` index of `octl` to `"rdy"`.
   - The variables `omem` and `obuf` remain unchanged.
 """
 )
 # Prepare chat messages
+chat_message = [{"role": "user", "content": f"{instruct}\n{input_text}"}]
 # Inference
 responses = llm.chat(chat_message, greed_sampling, use_tqdm=True)
       primaryClass={cs.AI},
       url={https://arxiv.org/abs/2501.16207},
 }
+```
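
The prompt construction touched by this diff can be checked without loading the model. The snippet below is a minimal sketch, not the card's exact code: `build_messages` is a hypothetical helper, and the value of `instruct` is a placeholder taken from the one instruction line visible in the hunk headers (the full instruction text is not shown in the diff). It illustrates two points: the separator between `instruct` and `input_text` has to be the escape `\n`, because a literal line break inside a single-quoted f-string is a syntax error, and removing the backslashes from `\"done\"` and `\"rdy\"` leaves the string unchanged.

```python
# Placeholder instruction (assumption): only this sentence is visible in the diff.
instruct = "You only need to return the TLA formal specification without explanation."

# Input text as it appears on the "+" side of the diff (unescaped quotes).
input_text = """
An operation `LM_Inner_Rsp(p)` that represents a response process for a given parameter `p`. It satisfies the following conditions:
  - The control state `octl[p]` is equal to `"done"`.
  - The `Reply(p, obuf[p], memInt, memInt')` operation is executed.
  - The control state `octl` is updated by setting the `p` index of `octl` to `"rdy"`.
  - The variables `omem` and `obuf` remain unchanged.
"""


def build_messages(instruction: str, user_input: str) -> list[dict[str, str]]:
    """Hypothetical helper: join instruction and input into a single user turn."""
    # The separator is the two-character escape \n, not a literal line break.
    return [{"role": "user", "content": f"{instruction}\n{user_input}"}]


messages = build_messages(instruct, input_text)

# In a Python string literal, \" and " denote the same character,
# so the quote change in the diff does not alter the prompt text.
assert '''`\"done\"`''' == '''`"done"`'''
```

The resulting `messages` list has the same shape as the one passed to `tokenizer.apply_chat_template` and `llm.chat` in the card's snippets.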