Update README.md
README.md

@@ -91,22 +91,13 @@ Our training script was built on top of the official training script that we pro
 You can refer to [this script](https://github.com/patil-suraj/muse-experiments/blob/f71e7e79af24509ddb4e1b295a1d0ef8d8758dc9/ctrlnet/train_controlnet_webdataset.py) for full disclosure.
 
 #### Training data
-
-It was then further trained for 20,000 steps on LAION 6a resized to a max minimum dimension of 1024 and
-then filtered to contain only images with a minimum dimension of 1024. We found the further high-resolution
-finetuning was necessary for image quality.
+The model was trained on 3M images from the LAION aesthetic 6+ subset, with a batch size of 256 for 50k steps and a constant learning rate of 3e-5.
 
 #### Compute
-
-
-#### Batch size
-Data parallel with a single-GPU batch size of 8 for a total batch size of 64.
-
-#### Hyper Parameters
-Constant learning rate of 1e-4 scaled by batch size for a total learning rate of 64e-4
+One 8xA100 machine
 
 #### Mixed precision
-
+FP16
 
 #### Additional notes
 
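The updated model card gives a total batch size of 256 on one 8xA100 machine. A minimal sketch of how the per-GPU batch size would relate to that total, assuming pure data parallelism with no gradient accumulation (the README does not state either value):

```python
# Sketch: per-GPU batch size under plain data parallelism.
# Assumption: no gradient accumulation; the README only gives
# the total batch size (256) and the machine (8xA100).
num_gpus = 8
total_batch_size = 256

per_gpu_batch_size = total_batch_size // num_gpus
print(per_gpu_batch_size)  # 32
```

If gradient accumulation were used, the per-GPU batch size would be divided further by the number of accumulation steps.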