---
tags:
- text-generation
---

# Model Card for GPT-J-6B-Skein
# Model Details

## Model Description

A text-generation model developed by KoboldAI, based on EleutherAI's GPT-J 6B.

- **Developed by:** KoboldAI
- **Shared by [Optional]:** More information needed
- **Model type:** Text Generation
- **Language(s) (NLP):** More information needed
- **License:** More information needed
- **Related Models:** [GPT-J 6B](https://huggingface.co/EleutherAI/gpt-j-6B)
  - **Parent Model:** GPT-J
- **Resources for more information:**
  - [GitHub Repo](https://github.com/kingoflolz/mesh-transformer-jax)
  - [Associated Model Doc](https://huggingface.co/docs/transformers/main/en/model_doc/gptj#transformers.GPTJForCausalLM)
# Uses

## Direct Use

This model can be used for the task of text generation.
## Downstream Use [Optional]

More information needed

## Out-of-Scope Use

The model should not be used to intentionally create hostile or alienating environments for people.

# Bias, Risks, and Limitations

The core functionality of GPT-J is taking a string of text and predicting the next token. While language models are widely used for tasks other than this, there are a lot of unknowns with this work. When prompting GPT-J, it is important to remember that the statistically most likely next token is often not the token that produces the most "accurate" text. Never depend upon GPT-J to produce factually accurate output.
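The gap between likelihood and accuracy can be seen with a toy next-token distribution (the tokens and probabilities below are invented purely for illustration): greedy decoding always returns the argmax token, while sampling can surface any plausible continuation, accurate or not.

```python
import random

# Invented next-token distribution for an example prompt; a real model
# produces probabilities via a softmax over its full vocabulary.
next_token_probs = {
    "Paris": 0.55,   # the "accurate" continuation happens to be most likely here
    "a": 0.25,       # grammatical but uninformative ("...is a city")
    "not": 0.12,     # fluent continuations can lead somewhere false
    "Lyon": 0.08,    # high-probability text is not guaranteed to be correct
}

def greedy_pick(probs):
    """Greedy decoding: always take the single most likely token."""
    return max(probs, key=probs.get)

def sample_pick(probs, rng):
    """Sampling: draw a token in proportion to its probability."""
    tokens, weights = zip(*probs.items())
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)
print(greedy_pick(next_token_probs))   # always "Paris"
print([sample_pick(next_token_probs, rng) for _ in range(5)])
```

Even here, nothing forces the most likely token to be the factually correct one; in less contrived prompts the argmax is frequently just the most statistically typical continuation.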
GPT-J was trained on the Pile, a dataset known to contain profanity, lewd, and otherwise abrasive language. Depending upon the use case, GPT-J may produce socially unacceptable text. See Sections 5 and 6 of the Pile paper for a more detailed analysis of the biases in the Pile.

As with all language models, it is hard to predict in advance how GPT-J will respond to particular prompts, and offensive content may occur without warning. We recommend having a human curate or filter the outputs before releasing them, both to censor undesirable content and to improve the quality of the results.
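As one illustration of the filtering step recommended above, a minimal keyword blocklist can hold generations back for human review. This is only a sketch: the blocklist contents and function names are hypothetical, and real deployments typically pair human review with a trained toxicity classifier rather than keyword matching.

```python
import re

# Hypothetical blocklist; a real deployment would use a curated list or a
# trained classifier rather than a handful of placeholder keywords.
BLOCKLIST = {"badword", "offensiveword"}

def needs_review(text: str, blocklist=BLOCKLIST) -> bool:
    """Return True if any blocklisted word appears, flagging the text for a human."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return not words.isdisjoint(blocklist)

def filter_outputs(generations):
    """Split model outputs into auto-releasable text and text held for human review."""
    released, held = [], []
    for g in generations:
        (held if needs_review(g) else released).append(g)
    return released, held
```

Held-back outputs would then go to a human curator, matching the recommendation that a person review content before release.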
See the [GPT-J 6B model card](https://huggingface.co/EleutherAI/gpt-j-6B) for more information.

## Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information needed for further recommendations.
# Training Details

## Training Data

More information needed

## Training Procedure

### Preprocessing

More information needed

### Speeds, Sizes, Times

More information needed

# Evaluation

## Testing Data, Factors & Metrics

### Testing Data

More information needed

### Factors

More information needed

### Metrics

More information needed

## Results

More information needed

# Model Examination

More information needed

# Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** More information needed
- **Hours used:** More information needed
- **Cloud Provider:** More information needed
- **Compute Region:** More information needed
- **Carbon Emitted:** More information needed
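The calculator referenced above follows the simple accounting in Lacoste et al. (2019): energy drawn by the hardware (power draw times hours, scaled by datacenter overhead) multiplied by the carbon intensity of the local grid. A minimal sketch of that estimate, with all numbers chosen purely as hypothetical placeholders:

```python
def estimate_co2_kg(power_kw: float, hours: float,
                    carbon_intensity_kg_per_kwh: float,
                    pue: float = 1.0) -> float:
    """Rough CO2eq estimate: energy used (kWh) times grid carbon intensity.

    power_kw: average hardware power draw in kilowatts
    hours: total training time
    carbon_intensity_kg_per_kwh: kg CO2eq emitted per kWh in the compute region
    pue: datacenter power usage effectiveness (overhead multiplier)
    """
    energy_kwh = power_kw * hours * pue
    return energy_kwh * carbon_intensity_kg_per_kwh

# Hypothetical example: a 0.25 kW accelerator running 100 hours on a grid
# emitting 0.4 kg CO2eq per kWh (illustration only, not this model's figures).
print(estimate_co2_kg(power_kw=0.25, hours=100, carbon_intensity_kg_per_kwh=0.4))
```

Filling in the hardware type, hours, and region requested above would make such an estimate possible for this model.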
# Technical Specifications [optional]

## Model Architecture and Objective

More information needed

## Compute Infrastructure

More information needed

### Hardware

More information needed

### Software

More information needed

# Citation

**BibTeX:**

```
@misc{mesh-transformer-jax,
  author = {Wang, Ben},
  title = {{Mesh-Transformer-JAX: Model-Parallel Implementation of Transformer Language Model with JAX}},
  howpublished = {\url{https://github.com/kingoflolz/mesh-transformer-jax}},
  year = 2021,
  month = May
}
```

# Glossary [optional]

More information needed

# More Information [optional]

More information needed

# Model Card Authors [optional]

KoboldAI in collaboration with Ezi Ozoani and the Hugging Face team

# Model Card Contact

More information needed

# How to Get Started with the Model

Use the code below to get started with the model.

<details>
<summary> Click to expand </summary>

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("KoboldAI/GPT-J-6B-Skein")

model = AutoModelForCausalLM.from_pretrained("KoboldAI/GPT-J-6B-Skein")
```
</details>

