mahing committed
Commit 40b98cd · verified · Parent(s): 31ab5e5

Update README.md

Files changed (1):
  1. README.md +2 -2
README.md CHANGED
```diff
@@ -14,8 +14,8 @@ tags:
 **IN PROGRESS** <br />
 
 **Introduction** <br />
-LLMs can be used to build out accurate and informative first-person narratives from historical periods, mimicking the language and speech of an era. This in turn can be used to create educational stories from that era to guide listeners through their journey into a specific period in history. This task for an LLM can expand our understanding of culture and language from historical eras in a fun way, which can be used for educational purposes in schools and museums. Using current LLMs for this task would not be very successful as current models are trained on so much data and are not tailored for this specific task, leading to possible anachronisms and inaccuracies in the language it uses and the historical information. Using current models resulted in sub-par narratives even after many different prompt engineering and few-shot prompting methods. The models seemed to perform the worst in terms of using linguistics of an era and creating a story with vivid details, which a custom LLM could improve on. <br />
-To successfully fine-tune an LLM for this task, I first picked a suitable base model that created passable narratives with few-shot prompting and had few enough parameters to not require massive amounts of compute for fine-tuning. I chose to use Qwen2.5-1M for this purpose. I then used Gutenberg and other sources to find historical documents that could be used as input data to train the custom Qwen model, matching a synthetically generated narrative to each historical document as the output. This was used as the training data for LoRA, which updated the most relevant parameters for my custom task, leading to strong qualitative results. The historical narratives generated after fine-tuning were much stronger than current LLM results and exceeded expectations. This result also did not worsen the model’s general understanding based on evaluation metrics. If used in schools, this model could create engaging, creative, and informative first-person narratives to build knowledge and interest in history for students.
+LLMs can be used to build out accurate and informative first-person narratives from historical periods, mimicking the language and speech of an era. This in turn can be used to create educational stories from that era to guide listeners through their journey into a specific period in history. This task for an LLM can expand our understanding of culture and language from historical eras in a fun way, which can be used for educational purposes in schools and museums. <br />
+To successfully fine-tune an LLM for this task, I first picked a suitable base model that created passable narratives with few-shot prompting and had few enough parameters to not require massive amounts of compute for fine-tuning. I chose to use Qwen2.5-1M for this purpose. I then used Gutenberg and other sources to find historical documents that could be used as input data to train the custom Qwen model, matching a synthetically generated narrative to each historical document. This was used as the training data for LoRA, which updated the most relevant parameters for my custom task. The historical narratives generated after fine-tuning were much stronger than current LLM results and exceeded expectations. If used in schools, this model could create engaging, creative, and informative first-person narratives to build knowledge and interest in history for students.
 
 **Training Data** <br />
 For this task, I utilized the various first-person sources and historical documents from Project Gutenberg as input data, along with manual searching for certain well-known documents. Project Gutenberg’s main goal is to digitize cultural and historical works, thereby including many biographies and memoirs throughout history that would be perfect in teaching an LLM to build out an accurate narrative from the document’s era. The output corresponding to this input data will be a first-person narrative based on the events in the input data. For example, if the input data is the description of a war, the output can be a soldier’s first-person account of their daily life during the war. The main source of my data wrangling was synthetically generating these first-person narratives using an LLM and testing its output using my knowledge and other LLMs to determine the strength of the response. Doing this was a tedious task, and I finished with approximately 900 document-narrative pairs, which I split up into 700 for the training set, and 100 for the validation and test sets using a random seed of 42.
```
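
The Introduction paragraph in the diff describes pairing each historical document with a synthetic narrative and fine-tuning Qwen2.5-1M with LoRA. As context, here is a minimal sketch of what that setup could look like with Hugging Face `transformers` and `peft`; the exact checkpoint (the README names only "Qwen2.5-1M"), the target modules, and all hyperparameters are illustrative assumptions, not settings recorded in this commit.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Assumed checkpoint: the README says only "Qwen2.5-1M", not which size.
model_name = "Qwen/Qwen2.5-7B-Instruct-1M"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

lora_config = LoraConfig(
    r=16,                                  # adapter rank (assumed)
    lora_alpha=32,                         # scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt (assumed)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Wrap the base model so that only the low-rank adapter weights are trained,
# which is what lets the fine-tune run without massive amounts of compute.
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```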
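The Training Data paragraph gives a concrete split: roughly 900 document-narrative pairs divided into 700 training, 100 validation, and 100 test examples with a random seed of 42. Below is a minimal sketch of one way to reproduce such a split; `pairs` and `split_pairs` are hypothetical names, not code from this repository.

```python
import random

def split_pairs(pairs, seed=42):
    """Shuffle ~900 (document, narrative) pairs and split them 700/100/100."""
    rng = random.Random(seed)    # fixed seed, matching the README's seed of 42
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    train = shuffled[:700]       # 700 training examples
    val = shuffled[700:800]      # 100 validation examples
    test = shuffled[800:900]     # 100 test examples
    return train, val, test
```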