agentlans committed on
Commit
8087b42
·
verified ·
1 Parent(s): 1c0537e

Update README.md

  1. README.md +129 -3
README.md CHANGED
@@ -1,3 +1,129 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ datasets:
4
+ - agentlans/questionizer
5
+ - agentlans/text-sft-questions-answers-only
6
+ language:
7
+ - en
8
+ base_model:
9
+ - google/flan-t5-small
10
+ pipeline_tag: text-generation
11
+ tags:
12
+ - question
13
+ - answer
14
+ ---
15
+ # FLAN T5 Small Questionizer
16
+
17
+ This model converts declarative statements into questions.
18
+
19
+ **Example:**
20
+ **Input:** The sun rises in the east and sets in the west.
21
+ **Output:** Where does the sun rise and set?
22
+
23
+ ## Usage
24
+
25
+ ```python
26
+ from transformers import pipeline
27
+
28
+ # Load the model
29
+ questionizer = pipeline("text2text-generation", model="agentlans/flan-t5-small-questionizer")
30
+
31
+ # Convert a statement into a question
32
+ statement = "Water covers approximately 71% of the Earth's surface, making it the most abundant substance on the planet's exterior."
33
+ question = questionizer(statement)[0]['generated_text']
34
+
35
+ print(question)
36
+ # Output: What percentage of the Earth's surface does water cover?
37
+ ```
38
+
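To convert several statements in one go, the single-statement call above can be wrapped in a small helper. This is a minimal sketch, not part of the model's API: `questionize_all` is a hypothetical name, and it assumes `generator` behaves like the `questionizer` pipeline above (called with one string, it returns a list whose first item has a `generated_text` key).

```python
from typing import Callable, Iterable, List


def questionize_all(statements: Iterable[str], generator: Callable) -> List[str]:
    """Return one generated question per statement (hypothetical helper)."""
    questions = []
    for statement in statements:
        # Same access pattern as the single-statement example:
        # generator(text)[0]['generated_text']
        questions.append(generator(statement)[0]["generated_text"])
    return questions


# Usage with the real pipeline (downloads the model on first run):
# from transformers import pipeline
# questionizer = pipeline("text2text-generation", model="agentlans/flan-t5-small-questionizer")
# print(questionize_all(["Mount Everest is the highest mountain in the world."], questionizer))
```

Passing the generator in as an argument keeps the helper independent of how the model is loaded, so it can be exercised with a stub before downloading anything.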
39
+ ## Examples
40
+
41
+ <details>
42
+ <summary>Click here for simple sentence examples</summary>
43
+
44
+ **Input:** The sun rises in the east and sets in the west.
45
+ **Output:** Where does the sun rise and set?
46
+
47
+ **Input:** Python is a popular programming language for beginners.
48
+ **Output:** What is a popular programming language for beginners?
49
+
50
+ **Input:** Elephants are the largest land animals on Earth.
51
+ **Output:** What are the largest land animals on Earth?
52
+
53
+ **Input:** Rainbows appear when sunlight passes through raindrops.
54
+ **Output:** When do rainbows appear?
55
+
56
+ **Input:** Saturn has beautiful rings made of ice and rock.
57
+ **Output:** What is the shape of Saturn's rings?
58
+
59
+ **Input:** Coffee is enjoyed by millions of people every morning.
60
+ **Output:** How many people enjoy coffee every morning?
61
+
62
+ **Input:** Mount Everest is the highest mountain in the world.
63
+ **Output:** What is the highest mountain in the world?
64
+
65
+ **Input:** Honeybees communicate through a dance called the waggle.
66
+ **Output:** How do honeybees communicate?
67
+
68
+ **Input:** Penguins live in cold climates and cannot fly.
69
+ **Output:** Where do Penguins live and cannot fly?
70
+
71
+ **Input:** Artists use different colors and shapes to express ideas.
72
+ **Output:** What do artists use to express ideas?
73
+ </details>
74
+
75
+ <details>
76
+ <summary>Click here for complex sentence examples</summary>
77
+
78
+ These sentences were randomly selected from the [agentlans/high-quality-english-sentences](https://huggingface.co/datasets/agentlans/high-quality-english-sentences) dataset.
79
+
80
+ **Input:** Want to know what takes up the most space on your Hard Drive?
81
+ **Output:** What do you want to know about the most space on your Hard Drive?
82
+
83
+ **Input:** DISA officials say Defense must embrace mobility management tools to ensure military users don't lag behind the private sector in secure mobile computing capability worldwide.
84
+ **Output:** What do DISA officials say Defense must do to ensure military users don't lag behind the private sector in secure mobile computing capability worldwide?
85
+
86
+ **Input:** This is why learning a healthy nutritious eating pattern and adopting it for life, that is, lifestyle changes has a significant impact on our ability to reach and maintain a healthy weight.
87
+ **Output:** Why is it important to learn a healthy nutritious eating pattern and adopt it for life?
88
+
89
+ **Input:** In 1933, Bartlett was chosen to create the very first gift print, a color etching called 'Java'.
90
+ **Output:** What color etching was Bartlett chosen to create in 1933?
91
+
92
+ **Input:** In the United States, NERC petitions the Federal Energy Regulatory Commission (FERC) for approval of standards.
93
+ **Output:** What does NERC petition the Federal Energy Regulatory Commission (FERC) for in the United States?
94
+
95
+ **Input:** The majority of large businesses are public companies, with government enterprises, not-for-profit organisations and superannuation funds also represented.
96
+ **Output:** What is the majority of large businesses, and what types of organizations are represented?
97
+
98
+ **Input:** Because blueberries need moisture close to the soil surface, it is important to mulch them well.
99
+ **Output:** Why is it important to mulch blueberries well?
100
+
101
+ **Input:** One of the victims was a man but the other could not be determined.
102
+ **Output:** Who was one of the victims, and how was the other determined?
103
+
104
+ **Input:** The statute gives States and local educational agencies significant flexibility in how they direct resources and tailor interventions to the needs of individual schools identified for improvement.
105
+ **Output:** What flexibility does the statute provide for States and local educational agencies?
106
+
107
+ **Input:** Similar legislation would allay any hesitancy on the par of the banks in sharing cyber threat information with the government, Tunstall suggests.
108
+ **Output:** What would similar legislation allay in sharing cyber threat information with the government, according to Tunstall?
109
+ </details>
110
+
111
+ ## Limitations
112
+
113
+ * The model works best with statements that provide enough context. Short or vague sentences may lead to hallucinated or unrelated questions. Example:
114
+
115
+ **Input:** No.
116
+ **Output:** Is there a requirement for a person to have a copy of a book in a library?
117
+
118
+ * Not all statements are suitable for question generation. Some inputs may produce awkward questions or questions that do not match the intended meaning.
119
+
120
+ ### Tips for Better Results
121
+
122
+ 1. **Use clear, informative statements:** Include enough context so the model can generate a meaningful question.
123
+ 2. **Prefer factual sentences:** The model performs better on statements that contain concrete information (dates, quantities, events, definitions).
124
+ 3. **Avoid extremely short inputs:** Single words or one-word answers rarely produce useful questions.
125
+ 4. **Check generated questions:** The model is small and can produce inaccurate or awkward questions, so review outputs for accuracy and relevance, especially for educational or professional use.
126
+
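The advice about very short inputs can be applied in code by screening statements before they reach the model. The following is a minimal sketch: `has_enough_context` and the `min_words=4` threshold are illustrative assumptions, not values taken from the model or its training data.

```python
def has_enough_context(text: str, min_words: int = 4) -> bool:
    """Heuristic pre-check: True if the text has at least min_words words.

    The threshold is an arbitrary illustration; tune it for your data.
    """
    return len(text.split()) >= min_words


statements = [
    "No.",
    "Mount Everest is the highest mountain in the world.",
]
# Keep only statements that pass the pre-check before calling the model.
usable = [s for s in statements if has_enough_context(s)]
print(usable)
```

A word count is a crude proxy for context, so a sentence that passes the check can still yield an awkward question and should be reviewed like any other output.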
127
+ ## Licence
128
+
129
+ Apache 2.0