Commit · d023078
1 Parent(s): b657463

Update README.md

README.md CHANGED
@@ -1,50 +1,56 @@
 ---
-tags:
-- generated_from_trainer
-model-index:
-- name: t5-base-emolm
-  results: []
----
-
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
-# t5-base-emolm
-
-This model is a fine-tuned version of [saved_models/t5-base](https://huggingface.co/saved_models/t5-base) on an unknown dataset.
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
-
-### Training hyperparameters
-
-The following hyperparameters were used during training:
-- learning_rate: 0.0003
-- train_batch_size: 32
-- eval_batch_size: 128
-- seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- num_epochs: 2.0
-
-### Training results
-
-
-
-### Framework versions
 
-
-
--
-
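The removed card recorded only raw Trainer hyperparameters. For reference, a minimal sketch of how that configuration is typically expressed with the `transformers` Trainer API is shown below; the `generated_from_trainer` tag suggests this API was used, but the `output_dir` value and the choice of `Seq2SeqTrainingArguments` (rather than plain `TrainingArguments`) are assumptions, not the authors' actual training script.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical mapping of the removed card's hyperparameters onto Trainer arguments.
# Values are taken from the card; output_dir is an assumed path.
training_args = Seq2SeqTrainingArguments(
    output_dir="saved_models/t5-base-emolm",   # assumption
    learning_rate=3e-4,                        # learning_rate: 0.0003
    per_device_train_batch_size=32,            # train_batch_size: 32
    per_device_eval_batch_size=128,            # eval_batch_size: 128
    seed=42,                                   # seed: 42
    adam_beta1=0.9,                            # optimizer: Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,                         # and epsilon=1e-08
    lr_scheduler_type="linear",                # lr_scheduler_type: linear
    num_train_epochs=2.0,                      # num_epochs: 2.0
)
```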
+datasets:
+  - KomeijiForce/Text2Emoji
+language:
+  - en
+metrics:
+  - bertscore
+pipeline_tag: text2text-generation
 ---
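The metadata block above names the training dataset and the evaluation metric. For orientation only, here is a minimal sketch of pulling that dataset with the `datasets` library and scoring a prediction with BERTScore through `evaluate`; the split name and the example strings are assumptions, since the card does not document the dataset schema.

```python
from datasets import load_dataset
import evaluate

# Dataset declared in the card metadata (split name assumed).
text2emoji = load_dataset("KomeijiForce/Text2Emoji", split="train")
print(text2emoji)  # inspect the columns; the card does not document them

# Metric declared in the card metadata: BERTScore.
bertscore = evaluate.load("bertscore")
scores = bertscore.compute(
    predictions=["🍕😋"],   # hypothetical model output
    references=["🍕😍"],    # hypothetical reference emoji string
    lang="en",
)
print(scores["f1"])
```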
 
+# EmojiLM
+
+This is a [T5](https://huggingface.co/t5-base) model pre-trained on the [Text2Emoji](https://huggingface.co/datasets/KomeijiForce/Text2Emoji) dataset to translate sentences into series of emojis.
+
+For instance, "I love pizza" will be translated into "🍕😋".
+
+An example implementation for translation:
+
+```python
+from transformers import T5Tokenizer, T5ForConditionalGeneration
+
+path = "saved_models/t5-base-emolm"
+tokenizer = T5Tokenizer.from_pretrained(path)
+generator = T5ForConditionalGeneration.from_pretrained(path)
+
+prefix = "translate into emojis:"
+sentence = "I travel to enjoy the taste of sushi!"
+inputs = tokenizer(prefix+" "+sentence, return_tensors="pt")
+generated_ids = generator.generate(inputs["input_ids"], num_beams=4, do_sample=True, max_length=100)
+decoded = tokenizer.decode(generated_ids[0], skip_special_tokens=True).replace(" ", "")
+print(decoded)
+```
+
+You will probably get some output like "🍣🐟🍱✈️😋".
+
+If you find this model & dataset resource useful, please consider citing our paper:
+
+```
+@article{DBLP:journals/corr/abs-2311-01751,
+  author       = {Letian Peng and
+                  Zilong Wang and
+                  Hang Liu and
+                  Zihan Wang and
+                  Jingbo Shang},
+  title        = {EmojiLM: Modeling the New Emoji Language},
+  journal      = {CoRR},
+  volume       = {abs/2311.01751},
+  year         = {2023},
+  url          = {https://doi.org/10.48550/arXiv.2311.01751},
+  doi          = {10.48550/ARXIV.2311.01751},
+  eprinttype   = {arXiv},
+  eprint       = {2311.01751},
+  timestamp    = {Tue, 07 Nov 2023 18:17:14 +0100},
+  biburl       = {https://dblp.org/rec/journals/corr/abs-2311-01751.bib},
+  bibsource    = {dblp computer science bibliography, https://dblp.org}
+}
+```
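Because the card's `pipeline_tag` is `text2text-generation`, the usage example in the new card can also be run through the high-level `pipeline` API. A minimal sketch, reusing the same local checkpoint path from the card (swap in the Hub repo id if loading remotely):

```python
from transformers import pipeline

# text2text-generation is the task declared in the card metadata.
translator = pipeline("text2text-generation", model="saved_models/t5-base-emolm")

out = translator(
    "translate into emojis: I travel to enjoy the taste of sushi!",
    num_beams=4, do_sample=True, max_length=100,
)
# The model emits space-separated emoji tokens; strip the spaces as in the card's example.
print(out[0]["generated_text"].replace(" ", ""))
```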
