Qwickly forging AGI, enhancing intelligence.

OFASys: Enabling Multitask Learning with One Line of Code!

Generalist models are hot! We all see an opportunity to build a real generalist model through multimodal multitask learning. We previously released an open-sourced unified multimodal pretrained model, OFA, toward this goal. However, we met many difficulties in our implementation. For example, it is hard to set up multiple tasks spanning multiple modalities, and it is hard to organize multitask learning, e.g., how to batchify your data and how to keep your training stable....

December 28, 2022 · 6 min · 1108 words · Qwen Team

Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese

CLIP is a phenomenal playmaker in vision and multimodal representation learning. It serves not only as a foundation model but also as a bridge between vision and language, and it has triggered a series of research efforts in different fields, especially text-to-image generation. However, we find that applications, especially cross-modal retrieval, need a language-specific CLIP, and there is no open-sourced Chinese CLIP with good performance. We therefore launched this project to promote Chinese multimodal representation learning....

December 24, 2022 · 4 min · 850 words · Qwen Team