Update README.md
README.md
CHANGED
|
@@ -1,3 +1,129 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
-
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
datasets:
|
| 4 |
+
- agentlans/questionizer
|
| 5 |
+
- agentlans/text-sft-questions-answers-only
|
| 6 |
+
language:
|
| 7 |
+
- en
|
| 8 |
+
base_model:
|
| 9 |
+
- google/flan-t5-small
|
| 10 |
+
pipeline_tag: text-generation
|
| 11 |
+
tags:
|
| 12 |
+
- question
|
| 13 |
+
- answer
|
| 14 |
+
---
|
| 15 |
+
# FLAN T5 Small Questionizer
|
| 16 |
+
|
| 17 |
+
This model converts declarative statements into questions.
|
| 18 |
+
|
| 19 |
+
**Example:**
|
| 20 |
+
**Input:** The sun rises in the east and sets in the west.
|
| 21 |
+
**Output:** Where does the sun rise and set?
|
| 22 |
+
|
| 23 |
+
## Usage
|
| 24 |
+
|
| 25 |
+
```python
|
| 26 |
+
from transformers import pipeline
|
| 27 |
+
|
| 28 |
+
# Load the model
|
| 29 |
+
questionizer = pipeline("text2text-generation", model="agentlans/flan-t5-small-questionizer")
|
| 30 |
+
|
| 31 |
+
# Convert a statement into a question
|
| 32 |
+
statement = "Water covers approximately 71% of the Earth's surface, making it the most abundant substance on the planet's exterior."
|
| 33 |
+
question = questionizer(statement)[0]['generated_text']
|
| 34 |
+
|
| 35 |
+
print(question)
|
| 36 |
+
# Output: What percentage of the Earth's surface does water cover?
|
| 37 |
+
```
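
To convert several statements in one go, the single call above can be wrapped in a small helper. This is only a sketch: `load_questionizer` and `questionize_all` are hypothetical names, not part of the model or of `transformers`. The helper takes the generation function as an argument, so the batching logic can be tried without downloading the model.

```python
from typing import Callable, Iterable, List, Tuple


def load_questionizer() -> Callable[[str], str]:
    """Return a function that maps one statement to one question.

    Assumption: reuses the same pipeline call as the usage example above.
    """
    from transformers import pipeline  # imported lazily; only needed here

    pipe = pipeline("text2text-generation", model="agentlans/flan-t5-small-questionizer")
    return lambda statement: pipe(statement)[0]["generated_text"]


def questionize_all(
    statements: Iterable[str], generate: Callable[[str], str]
) -> List[Tuple[str, str]]:
    """Pair each non-empty statement with the question generated for it."""
    pairs = []
    for statement in statements:
        text = statement.strip()
        if not text:
            continue  # skip blank entries instead of sending them to the model
        pairs.append((text, generate(text)))
    return pairs
```

With the model available, `questionize_all(statements, load_questionizer())` should return a list of `(statement, question)` pairs in input order.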
|
| 38 |
+
|
| 39 |
+
## Examples
|
| 40 |
+
|
| 41 |
+
<details>
|
| 42 |
+
<summary>Click here for simple sentence examples</summary>
|
| 43 |
+
|
| 44 |
+
**Input:** The sun rises in the east and sets in the west.
|
| 45 |
+
**Output:** Where does the sun rise and set?
|
| 46 |
+
|
| 47 |
+
**Input:** Python is a popular programming language for beginners.
|
| 48 |
+
**Output:** What is a popular programming language for beginners?
|
| 49 |
+
|
| 50 |
+
**Input:** Elephants are the largest land animals on Earth.
|
| 51 |
+
**Output:** What are the largest land animals on Earth?
|
| 52 |
+
|
| 53 |
+
**Input:** Rainbows appear when sunlight passes through raindrops.
|
| 54 |
+
**Output:** When do rainbows appear?
|
| 55 |
+
|
| 56 |
+
**Input:** Saturn has beautiful rings made of ice and rock.
|
| 57 |
+
**Output:** What is the shape of Saturn's rings?
|
| 58 |
+
|
| 59 |
+
**Input:** Coffee is enjoyed by millions of people every morning.
|
| 60 |
+
**Output:** How many people enjoy coffee every morning?
|
| 61 |
+
|
| 62 |
+
**Input:** Mount Everest is the highest mountain in the world.
|
| 63 |
+
**Output:** What is the highest mountain in the world?
|
| 64 |
+
|
| 65 |
+
**Input:** Honeybees communicate through a dance called the waggle.
|
| 66 |
+
**Output:** How do honeybees communicate?
|
| 67 |
+
|
| 68 |
+
**Input:** Penguins live in cold climates and cannot fly.
|
| 69 |
+
**Output:** Where do Penguins live and cannot fly?
|
| 70 |
+
|
| 71 |
+
**Input:** Artists use different colors and shapes to express ideas.
|
| 72 |
+
**Output:** What do artists use to express ideas?
|
| 73 |
+
</details>
|
| 74 |
+
|
| 75 |
+
<details>
|
| 76 |
+
<summary>Click here for complex sentence examples</summary>
|
| 77 |
+
|
| 78 |
+
These sentences were randomly selected from the [agentlans/high-quality-english-sentences](https://huggingface.co/datasets/agentlans/high-quality-english-sentences) dataset.
|
| 79 |
+
|
| 80 |
+
**Input:** Want to know what takes up the most space on your Hard Drive?
|
| 81 |
+
**Output:** What do you want to know about the most space on your Hard Drive?
|
| 82 |
+
|
| 83 |
+
**Input:** DISA officials say Defense must embrace mobility management tools to ensure military users don't lag behind the private sector in secure mobile computing capability worldwide.
|
| 84 |
+
**Output:** What do DISA officials say Defense must do to ensure military users don't lag behind the private sector in secure mobile computing capability worldwide?
|
| 85 |
+
|
| 86 |
+
**Input:** This is why learning a healthy nutritious eating pattern and adopting it for life, that is, lifestyle changes has a significant impact on our ability to reach and maintain a healthy weight.
|
| 87 |
+
**Output:** Why is it important to learn a healthy nutritious eating pattern and adopt it for life?
|
| 88 |
+
|
| 89 |
+
**Input:** In 1933, Bartlett was chosen to create the very first gift print, a color etching called 'Java'.
|
| 90 |
+
**Output:** What color etching was Bartlett chosen to create in 1933?
|
| 91 |
+
|
| 92 |
+
**Input:** In the United States, NERC petitions the Federal Energy Regulatory Commission (FERC) for approval of standards.
|
| 93 |
+
**Output:** What does NERC petition the Federal Energy Regulatory Commission (FERC) for in the United States?
|
| 94 |
+
|
| 95 |
+
**Input:** The majority of large businesses are public companies, with government enterprises, not-for-profit organisations and superannuation funds also represented.
|
| 96 |
+
**Output:** What is the majority of large businesses, and what types of organizations are represented?
|
| 97 |
+
|
| 98 |
+
**Input:** Because blueberries need moisture close to the soil surface, it is important to mulch them well.
|
| 99 |
+
**Output:** Why is it important to mulch blueberries well?
|
| 100 |
+
|
| 101 |
+
**Input:** One of the victims was a man but the other could not be determined.
|
| 102 |
+
**Output:** Who was one of the victims, and how was the other determined?
|
| 103 |
+
|
| 104 |
+
**Input:** The statute gives States and local educational agencies significant flexibility in how they direct resources and tailor interventions to the needs of individual schools identified for improvement.
|
| 105 |
+
**Output:** What flexibility does the statute provide for States and local educational agencies?
|
| 106 |
+
|
| 107 |
+
**Input:** Similar legislation would allay any hesitancy on the par of the banks in sharing cyber threat information with the government, Tunstall suggests.
|
| 108 |
+
**Output:** What would similar legislation allay in sharing cyber threat information with the government, according to Tunstall?
|
| 109 |
+
</details>
|
| 110 |
+
|
| 111 |
+
## Limitations
|
| 112 |
+
|
| 113 |
+
* The model works best with statements that provide enough context. Short or vague sentences may lead to hallucinated or unrelated questions. Example:
|
| 114 |
+
|
| 115 |
+
**Input:** No.
|
| 116 |
+
**Output:** Is there a requirement for a person to have a copy of a book in a library?
|
| 117 |
+
|
| 118 |
+
* Not all statements are suitable for question generation. Some inputs may produce awkward questions or questions that do not match the intended meaning.
|
| 119 |
+
|
| 120 |
+
### Tips for Better Results
|
| 121 |
+
|
| 122 |
+
1. **Use clear, informative statements:** Include enough context so the model can generate a meaningful question.
|
| 123 |
+
2. **Prefer factual sentences:** The model performs better on statements that contain concrete information (dates, quantities, events, definitions).
|
| 124 |
+
3. **Avoid extremely short inputs:** Single words or one-word answers rarely produce useful questions.
|
| 125 |
+
4. **Check generated questions:** Review outputs for accuracy and relevance, especially for educational or professional use.
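
The third tip can be approximated with a simple pre-filter that skips very short inputs before they reach the model. This is a rough heuristic sketch, not something shipped with the model: `looks_informative` is a hypothetical helper, and the four-word threshold is an arbitrary assumption that may need tuning.

```python
import re


def looks_informative(statement: str, min_words: int = 4) -> bool:
    """Heuristic check: does the statement contain at least `min_words` words?

    The default threshold is an assumption, not a value from the model card.
    """
    words = re.findall(r"[A-Za-z0-9']+", statement)
    return len(words) >= min_words


candidates = ["No.", "Mount Everest is the highest mountain in the world."]
# Keep only statements long enough to give the model some context
usable = [s for s in candidates if looks_informative(s)]
```

Here `usable` keeps only the Mount Everest sentence. Word count is a crude proxy for context, so generated questions still need the review described in tip 4.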
|
| 126 |
+
|
| 127 |
+
## Licence
|
| 128 |
+
|
| 129 |
+
Apache 2.0
|